Chapter 1

Differential Calculus in Normed Linear Spaces

We shall recall in this chapter the notions of differentiability in the sense of Gateaux and Frechet for mappings between normed linear spaces and some of the properties of derivatives in relation to convexity and weak lower semi-continuity of functionals on normed linear spaces. We shall use these concepts throughout our discussions.

In the following all the vector spaces considered will be over the 
field of real numbers R. 

If $V$ is a normed (vector) space we shall denote by $\|\cdot\|_V$ the norm in $V$, by $V'$ its (strong) dual with $\|\cdot\|_{V'}$ as the norm, and by $\langle\cdot,\cdot\rangle_{V'\times V}$ the duality pairing between $V'$ and $V$. If $V$ is a Hilbert space then $(\cdot,\cdot)_V$ will denote the inner product in $V$. If $V$ and $H$ are two normed spaces then $\mathcal{L}(V,H)$ denotes the vector space of all continuous linear mappings from $V$ into $H$, provided with the norm $A \mapsto \|A\|_{\mathcal{L}(V,H)} = \sup\{\|Av\|_H/\|v\|_V ;\ v \in V,\ v \neq 0\}$.

1 Gateaux Derivatives 

Let $V$, $H$ be normed spaces and $A : U \subset V \to H$ be a mapping of an open subset $U$ of $V$ into $H$. We shall often call a vector $\varphi \in V$, $\varphi \neq 0$, a direction in $V$.




Definition 1.1. The mapping $A$ is said to be differentiable in the sense of Gateaux, or simply G-differentiable, at a point $u \in U$ in the direction $\varphi$ if the difference quotient

$$(A(u + \theta\varphi) - A(u))/\theta$$

has a limit $A'(u, \varphi)$ in $H$ as $\theta \to 0$ in $\mathbb{R}$. The (unique) limit $A'(u, \varphi)$ is called the Gateaux derivative of $A$ at $u$ in the direction $\varphi$.

$A$ is said to be G-differentiable in a direction $\varphi$ in a subset of $U$ if it is G-differentiable at every point of the subset in the direction $\varphi$.

We shall simply call $A'(u, \varphi)$ the G-derivative of $A$ at $u$ since the dependence on $\varphi$ is clear from the notation.

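Remark 1.1. The operator $V \ni \varphi \mapsto A'(u, \varphi) \in H$ is homogeneous: $A'(u, \alpha\varphi) = \alpha A'(u, \varphi)$ for $\alpha > 0$.

In fact, setting $\lambda = \alpha\theta$,

$$A'(u, \alpha\varphi) = \lim_{\theta \to 0} (A(u + \alpha\theta\varphi) - A(u))/\theta = \alpha \lim_{\lambda \to 0} (A(u + \lambda\varphi) - A(u))/\lambda = \alpha A'(u, \varphi).$$

However, this operator is not, in general, linear, as can be seen immediately from Example 1.2 below.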

We shall often denote a functional on $U$ by $J$.

Remark 1.2. Every linear functional $L : V \to \mathbb{R}$ is G-differentiable everywhere in $V$ in all directions and its G-derivative is

$$L'(u, \varphi) = L(\varphi)$$

since $(L(u + \theta\varphi) - L(u))/\theta = L(\varphi)$. It is a constant functional (i.e. independent of $u$ in $V$).

If $a(u, v) : V \times V \to \mathbb{R}$ is a bilinear functional on $V$ then the functional $J : V \ni v \mapsto J(v) = a(v, v) \in \mathbb{R}$ is G-differentiable everywhere in all directions and

$$J'(u, \varphi) = a(u, \varphi) + a(\varphi, u).$$

If further $a(u, v)$ is symmetric (i.e. $a(u, v) = a(v, u)$ for all $u, v \in V$) then $J'(u, \varphi) = 2a(u, \varphi)$. This follows immediately from bilinearity:

$$a(u + \theta\varphi, u + \theta\varphi) = a(u, u) + \theta(a(u, \varphi) + a(\varphi, u)) + \theta^2 a(\varphi, \varphi)$$

so that

$$J'(u, \varphi) = \lim_{\theta \to 0} (J(u + \theta\varphi) - J(u))/\theta = a(u, \varphi) + a(\varphi, u).$$
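As an informal numerical illustration of the formula $J'(u, \varphi) = a(u, \varphi) + a(\varphi, u)$, one may take $V = \mathbb{R}^2$ and $a(u, v) = u^T M v$. The matrix `M`, the vectors `u`, `phi` and the helper names below are assumptions chosen only for this sketch; they do not come from the text.

```python
def a(u, v, M):
    """Bilinear form a(u, v) = u^T M v on R^n (M is a hypothetical matrix)."""
    return sum(u[i] * M[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

def gateaux_quotient(J, u, phi, theta):
    """Difference quotient (J(u + theta*phi) - J(u)) / theta."""
    shifted = [ui + theta * fi for ui, fi in zip(u, phi)]
    return (J(shifted) - J(u)) / theta

M = [[2.0, 1.0], [-3.0, 4.0]]      # deliberately non-symmetric
J = lambda v: a(v, v, M)           # J(v) = a(v, v)
u, phi = [1.0, 2.0], [0.5, -1.0]

exact = a(u, phi, M) + a(phi, u, M)          # formula from the text
approx = gateaux_quotient(J, u, phi, 1e-6)   # small but nonzero theta
print(exact, approx)
```

The two printed values differ only by the term $\theta\, a(\varphi, \varphi)$ and by rounding, in agreement with the expansion above.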

The following example will be a model case of linear problems in 
many of our discussions in the following chapters. 

Example 1.1. Let $(u, v) \mapsto a(u, v)$ be a symmetric bilinear form on a Hilbert space $V$ and $v \mapsto L(v)$ a linear form on $V$. Define the functional $J : V \to \mathbb{R}$ by

$$J(v) = \tfrac{1}{2} a(v, v) - L(v).$$

It follows from the above Remark that $J$ is G-differentiable everywhere in $V$ in all directions $\varphi$ and

$$J'(u, \varphi) = a(u, \varphi) - L(\varphi).$$

In many of the questions we shall assume:

(i) $a(\cdot, \cdot)$ is (bi-)continuous: there exists a constant $M > 0$ such that

$$a(u, v) \le M \|u\|_V \|v\|_V \text{ for all } u, v \in V;$$

(ii) $a(\cdot, \cdot)$ is $V$-coercive: there exists a constant $\alpha > 0$ such that

$$a(v, v) \ge \alpha \|v\|_V^2 \text{ for all } v \in V;$$

and

(iii) $L$ is continuous: there exists a constant $N > 0$ such that

$$L(v) \le N \|v\|_V \text{ for all } v \in V.$$


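Example 1.2. The function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by

$$f(x, y) = \begin{cases} 0 & \text{if } (x, y) = (0, 0), \\ x^5/((x - y)^2 + x^4) & \text{if } (x, y) \neq (0, 0) \end{cases}$$

is G-differentiable everywhere and in all directions. In fact, if $u = (0, 0) \in \mathbb{R}^2$ then given a direction $\varphi = (X, Y) \in \mathbb{R}^2$ ($\varphi \neq 0$) we have

$$(f(\theta X, \theta Y) - f(0, 0))/\theta = \theta^2 X^5/((X - Y)^2 + \theta^2 X^4)$$

which has a limit as $\theta \to 0$ and we have

$$f'(u, \varphi) = \begin{cases} 0 & \text{if } X \neq Y, \\ X & \text{if } X = Y. \end{cases}$$

One can also check easily that $f$ is G-differentiable in $\mathbb{R}^2$.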
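The non-linearity of $\varphi \mapsto f'(u, \varphi)$ at $u = (0, 0)$ can be observed numerically. The following is only a rough sketch: the step `theta` and the three sample directions are arbitrary choices made for illustration.

```python
def f(x, y):
    """The function of Example 1.2."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x**5 / ((x - y)**2 + x**4)

def g_derivative_at_origin(X, Y, theta=1e-4):
    """Difference quotient approximating f'((0, 0), (X, Y))."""
    return (f(theta * X, theta * Y) - f(0.0, 0.0)) / theta

d1 = g_derivative_at_origin(1.0, 1.0)   # X = Y: close to X = 1
d2 = g_derivative_at_origin(1.0, 0.0)   # X != Y: close to 0
d3 = g_derivative_at_origin(2.0, 1.0)   # direction (1, 1) + (1, 0): close to 0
print(d1, d2, d3)
```

Since `d3` is close to 0 while `d1 + d2` is close to 1, the G-derivative at the origin is not additive in the direction.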

The following will be the general abstract form of functionals in many of the non-linear problems that we shall consider.

Example 1.3. Let $\Omega$ be an open set in $\mathbb{R}^n$ and $V = L^p(\Omega)$, $p > 1$. Suppose $g : \mathbb{R} \ni t \mapsto g(t) \in \mathbb{R}$ is a $C^1$-function such that

(i) $|g(t)| \le C|t|^p$ and (ii) $|g'(t)| \le C|t|^{p-1}$

for some constant $C > 0$. Then

$$u \mapsto J(u) = \int_\Omega g(u(x))\,dx$$

defines a functional $J$ on $L^p(\Omega) = V$ which is G-differentiable everywhere in all directions and we have

$$J'(u, \varphi) = \int_\Omega g'(u(x))\varphi(x)\,dx.$$

(The right hand side here exists for any $u, \varphi \in L^p(\Omega)$.)

In fact, since $u \in L^p(\Omega)$ and since $g$ satisfies (i) we have

$$|J(u)| \le \int_\Omega |g(u)|\,dx \le C \int_\Omega |u|^p\,dx < +\infty$$

which means $J$ is well defined on $L^p(\Omega)$. On the other hand, for any $u \in L^p(\Omega)$, since $g'$ satisfies (ii), $g'(u) \in L^{p'}(\Omega)$ where $p^{-1} + p'^{-1} = 1$. For, we have

$$\int_\Omega |g'(u)|^{p'}\,dx \le C \int_\Omega |u|^{(p-1)p'}\,dx = C \int_\Omega |u|^p\,dx < +\infty.$$

Hence, for any $u, \varphi \in L^p(\Omega)$, we have by Hölder's inequality

$$\left| \int_\Omega g'(u)\varphi\,dx \right| \le \|g'(u)\|_{L^{p'}(\Omega)} \|\varphi\|_{L^p(\Omega)} \le C \|u\|_{L^p(\Omega)}^{p/p'} \|\varphi\|_{L^p(\Omega)} < +\infty.$$

To compute $J'(u, \varphi)$, if $\theta \in \mathbb{R}$ we define $h : [0, 1] \to \mathbb{R}$ by setting $h(t) = g(u + t\theta\varphi)$.

Then $h \in C^1(0, 1)$ and

$$h(1) - h(0) = \int_0^1 h'(t)\,dt = \theta\varphi(x) \int_0^1 g'(u + t\theta\varphi)\,dt$$

($t = t(x)$), $|t(x)| \le 1$, so that

$$(J(u + \theta\varphi) - J(u))/\theta = \int_\Omega \varphi(x) \int_0^1 g'(u(x) + t\theta\varphi(x))\,dt\,dx.$$

One can easily check as above that the function

$$(x, t) \mapsto \varphi(x) g'(u(x) + t\theta\varphi(x))$$

belongs to $L^1(\Omega \times [0, 1])$ and hence by Fubini's theorem

$$(J(u + \theta\varphi) - J(u))/\theta = \int_0^1 dt \int_\Omega \varphi(x) g'(u(x) + t\theta\varphi(x))\,dx.$$

Here the continuity of $g'$ implies that

$$g'(u + t\theta\varphi) \to g'(u) \text{ as } \theta \to 0 \text{ (and hence as } t\theta \to 0)$$

uniformly for $t \in [0, 1]$. Moreover, the condition (ii) together with the triangle inequality implies that, for $0 < \theta \le 1$,

$$|\varphi(x) g'(u(x) + t\theta\varphi(x))| \le C|\varphi(x)| (|u(x)| + |\varphi(x)|)^{p-1}$$

and the right side is integrable by Hölder's inequality. Then by the dominated convergence theorem we conclude that

$$J'(u, \varphi) = \int_\Omega g'(u)\varphi\,dx.$$
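A discretised version of this formula can be tested numerically. The sketch below assumes $\Omega = (0, 1)$, $p = 3$, $g(t) = |t|^3$, a midpoint quadrature rule with `N` cells, and the particular functions `u` and `phi`; all of these are illustrative choices and not part of the text.

```python
import math

P = 3.0  # exponent p > 1 (illustrative choice)

def g(t):
    """g(t) = |t|^p, which satisfies |g(t)| <= C|t|^p."""
    return abs(t) ** P

def dg(t):
    """g'(t) = p|t|^(p-1) sign(t), which satisfies |g'(t)| <= C|t|^(p-1)."""
    return P * abs(t) ** (P - 1.0) * math.copysign(1.0, t)

N = 400                                   # number of quadrature cells on (0, 1)
h = 1.0 / N
xs = [(i + 0.5) * h for i in range(N)]    # midpoints
u = [math.sin(2.0 * math.pi * x) for x in xs]
phi = list(xs)                            # direction phi(x) = x

def J(w):
    """Midpoint-rule approximation of the integral of g(w(x)) over (0, 1)."""
    return h * sum(g(wi) for wi in w)

theta = 1e-6
quotient = (J([ui + theta * fi for ui, fi in zip(u, phi)]) - J(u)) / theta
formula = h * sum(dg(ui) * fi for ui, fi in zip(u, phi))
print(quotient, formula)
```

For these choices the integral of $g'(u)\varphi$ can be computed by hand and equals $-3/8$, so both printed values should be close to $-0.375$.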

Definition 1.2. An operator $A : U \subset V \to H$ ($U$ being an open set in $V$) is said to be twice differentiable in the sense of Gateaux at a point $u \in U$ in the directions $\varphi$, $\psi$ ($\varphi, \psi \in V$, $\varphi \neq 0$, $\psi \neq 0$ given) if the operator $u \mapsto A'(u, \varphi)$; $U \subset V \to H$ is once G-differentiable at $u$ in the direction $\psi$. The G-derivative of $u \mapsto A'(u, \varphi)$ is called the second G-derivative of $A$ and is denoted by $A''(u; \varphi, \psi) \in H$,

i.e. $$A''(u; \varphi, \psi) = \lim_{\theta \to 0} (A'(u + \theta\psi, \varphi) - A'(u, \varphi))/\theta.$$

Remark 1.3. Derivatives of higher orders in the sense of Gateaux can 
be defined in the same way. As we shall not use derivatives of higher 
orders in the following we shall not consider their properties. 

Now let $J : U \subset V \to \mathbb{R}$ be a functional on an open set of a normed linear space $V$ which is once G-differentiable at a point $u \in U$. If the functional $\varphi \mapsto J'(u, \varphi)$ is continuous linear on $V$ then there exists a (unique) element $G(u) \in V'$ such that

$$J'(u, \varphi) = \langle G(u), \varphi \rangle_{V' \times V} \text{ for all } \varphi \in V.$$

Similarly, if $J$ is twice G-differentiable at a point $u \in U$ and if the form $(\varphi, \psi) \mapsto J''(u; \varphi, \psi)$ is a bilinear (bi-)continuous form on $V \times V$ then there exists a (unique) element $H(u) \in \mathcal{L}(V, V')$ such that

$$J''(u; \varphi, \psi) = \langle H(u)\varphi, \psi \rangle_{V' \times V}.$$

Definition 1.3. $G(u) \in V'$ is called the gradient of $J$ at $u$ and $H(u) \in \mathcal{L}(V, V')$ is called the Hessian of $J$ at $u$.

2 Taylor's Formula 

We shall next deduce the mean value theorem and Taylor's formula of second order for a mapping $A : U \subset V \to H$ ($U$ an open subset of a normed linear space $V$) in terms of the G-derivatives of $A$. We shall begin with the case of functionals on a normed linear space $V$.

Let $J$ be a functional defined on an open set $U$ in a normed linear space $V$ and $u, \varphi \in V$, $\varphi \neq 0$ be given. Throughout this section we assume that the set $\{u + \theta\varphi;\ \theta \in [0, 1]\}$ is contained in $U$. It is convenient to introduce the function $f : [0, 1] \to \mathbb{R}$ by setting

$$\theta \mapsto f(\theta) = J(u + \theta\varphi).$$

We observe that if $J'(u + \theta\varphi, \varphi)$ exists then $f$ is once differentiable in $]0, 1[$ and, as one can check immediately,

$$f'(\theta) = J'(u + \theta\varphi, \varphi).$$

Similarly if $J''(u + \theta\varphi; \varphi, \varphi)$ exists then $f$ is twice differentiable and

$$f''(\theta) = J''(u + \theta\varphi; \varphi, \varphi).$$

Proposition 2.1. Let $J$ be a functional on an open set $U$ of a normed space $V$ and $u \in U$, $\varphi \in V$ be given. If $\{u + \theta\varphi;\ \theta \in [0, 1]\} \subset U$ and $J$ is once G-differentiable on this set in the direction $\varphi$ then there exists a $\theta_0 \in\, ]0, 1[$ such that

(2.1) $$J(u + \varphi) = J(u) + J'(u + \theta_0\varphi, \varphi).$$

Proof. This follows immediately from the classical mean value theorem applied to the function $f$ on $[0, 1]$: there exists a $\theta_0 \in\, ]0, 1[$ such that

$$f(1) = f(0) + 1 \cdot f'(\theta_0)$$

which is nothing but (2.1). □

Proposition 2.2. Let $U$ be as in Proposition 2.1. If $J$ is twice G-differentiable on the set $\{u + \theta\varphi;\ \theta \in [0, 1]\}$ in the directions $\varphi, \varphi$ then there exists a $\theta_0 \in\, ]0, 1[$ such that

(2.2) $$J(u + \varphi) = J(u) + J'(u, \varphi) + \tfrac{1}{2} J''(u + \theta_0\varphi; \varphi, \varphi).$$

This again follows from the classical Taylor's formula applied to the function $f$ on $[0, 1]$.

Remark 2.1. If $L : V \to \mathbb{R}$ is a linear functional on $V$ then by Remark 1.2 $L$ is G-differentiable everywhere in all directions and we find that the formula (2.1) reads

$$L(u + \varphi) = L(u) + L(\varphi)$$

which is nothing but the additivity of $L$.

Similarly, if $a(\cdot, \cdot)$ is a bilinear form on $V$ then the functional $J(v) = a(v, v)$ on $V$ is twice G-differentiable in all pairs of directions $(\varphi, \psi)$ and

$$J'(u, \varphi) = a(u, \varphi) + a(\varphi, u), \quad J''(u; \varphi, \psi) = a(\psi, \varphi) + a(\varphi, \psi).$$

Then Taylor's formula (2.2) in this case reads

$$a(u + \varphi, u + \varphi) = a(u, u) + a(u, \varphi) + a(\varphi, u) + a(\varphi, \varphi)$$

which is nothing but the bilinearity of $a$.

These two facts together imply that the functional

$$J(v) = \tfrac{1}{2} a(v, v) - L(v)$$

of Example 1.1 admits a Taylor expansion of the form (Proposition 2.2)

$$J(u + \varphi) = J(u) + a(u, \varphi) - L(\varphi) + \tfrac{1}{2} a(\varphi, \varphi).$$

We shall now pass to the case of general operators between normed spaces. We remark first of all that Taylor's formula in the form (2.1) is not in general valid in this case. However, we have



Proposition 2.3. Let $V$, $H$ be two normed spaces, $U$ an open subset of $V$ and let $\varphi \in V$ be given. If the set $\{u + \theta\varphi;\ \theta \in [0, 1]\} \subset U$ and $A : U \subset V \to H$ is a mapping which is G-differentiable everywhere on the set $\{u + \theta\varphi;\ \theta \in [0, 1]\}$ in the direction $\varphi$ then, for any $g \in H'$, there exists a $\theta_g \in\, ]0, 1[$ such that

(2.3) $$\langle g, A(u + \varphi) \rangle_{H' \times H} = \langle g, A(u) \rangle_{H' \times H} + \langle g, A'(u + \theta_g\varphi, \varphi) \rangle_{H' \times H}.$$

Proof. We define a function $f : [0, 1] \to \mathbb{R}$ by setting

$$\theta \mapsto f(\theta) = \langle g, A(u + \theta\varphi) \rangle_{H' \times H}.$$

Then $f'(\theta)$ exists in $]0, 1[$ and

$$f'(\theta) = \langle g, A'(u + \theta\varphi, \varphi) \rangle_{H' \times H} \text{ for } \theta \in\, ]0, 1[.$$

Now (2.3) follows immediately on applying the classical mean value theorem to the function $f$. □

Proposition 2.4. Let $V$, $H$, $u$, $\varphi$ and $U$ be as in Proposition 2.3. If $A : U \subset V \to H$ is G-differentiable in the set $\{u + \theta\varphi;\ \theta \in [0, 1]\}$ in the direction $\varphi$ then there exists a $\theta_0 \in\, ]0, 1[$ such that

(2.4) $$\|A(u + \varphi) - A(u)\|_H \le \|A'(u + \theta_0\varphi, \varphi)\|_H.$$

The proof of this proposition uses the following Lemma which is a corollary to the Hahn-Banach theorem.

Lemma 2.1. If $H$ is a normed space then for any $v \in H$ there exists a $g \in H'$ such that

(2.5) $$\|g\|_{H'} = 1 \text{ and } \|v\|_H = \langle g, v \rangle_{H' \times H}.$$

For a proof see [34].

Proof of Proposition 2.4. The element $v = A(u + \varphi) - A(u)$ belongs to $H$; let $g \in H'$ be an element given by Lemma 2.1 satisfying (2.5), i.e.

$$\|g\|_{H'} = 1, \quad \|A(u + \varphi) - A(u)\|_H = \langle g, A(u + \varphi) - A(u) \rangle_{H' \times H}.$$

Since $A$ satisfies the assumptions of Proposition 2.3 it follows that there exists a $\theta_0 = \theta_g \in\, ]0, 1[$ such that

$$\|A(u + \varphi) - A(u)\|_H = \langle g, A(u + \varphi) - A(u) \rangle_{H' \times H} = \langle g, A'(u + \theta_0\varphi, \varphi) \rangle_{H' \times H} \le \|g\|_{H'} \|A'(u + \theta_0\varphi, \varphi)\|_H = \|A'(u + \theta_0\varphi, \varphi)\|_H,$$

proving (2.4).
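A standard example, not taken from the text, shows why the mean value formula can hold only as an inequality for vector-valued mappings. Take $V = \mathbb{R}$, $H = \mathbb{R}^2$, $A(t) = (\cos 2\pi t, \sin 2\pi t)$, $u = 0$ and $\varphi = 1$; the sampling grid below is an arbitrary choice for this sketch.

```python
import math

def A(t):
    """A : R -> R^2, a closed curve with A(0) = A(1)."""
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

def dA(t):
    """G-derivative of A at t in the direction phi = 1."""
    return (-2 * math.pi * math.sin(2 * math.pi * t),
            2 * math.pi * math.cos(2 * math.pi * t))

def norm(v):
    return math.hypot(v[0], v[1])

# Left side of (2.4): ||A(u + phi) - A(u)|| with u = 0, phi = 1.
increment = norm((A(1.0)[0] - A(0.0)[0], A(1.0)[1] - A(0.0)[1]))
# Smallest value of ||A'(theta, 1)|| over a grid of theta in ]0, 1[.
smallest_derivative = min(norm(dA(k / 1000)) for k in range(1, 1000))
print(increment, smallest_derivative)
```

The increment vanishes (up to rounding) while the derivative has norm $2\pi$ at every sampled point, so no $\theta_0$ gives equality $A(u + \varphi) - A(u) = A'(u + \theta_0\varphi, \varphi)$, whereas the inequality (2.4) is satisfied.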

Proposition 2.5. Suppose a functional $J : V \to \mathbb{R}$ has a gradient $G(u)$ for all $u \in V$ which is bounded, i.e. there exists a constant $M > 0$ such that $\|G(u)\|_{V'} \le M$ for all $u \in V$; then we have

(2.6) $$|J(u) - J(v)| \le M \|u - v\|_V \text{ for all } u, v \in V.$$

Proof. If $u, v \in V$ then taking $\varphi = v - u$ in Proposition 2.1 we can write, with some $\theta_0 \in\, ]0, 1[$,

$$J(v) - J(u) = J'(u + \theta_0(v - u), v - u) = \langle G(u + \theta_0(v - u)), v - u \rangle_{V' \times V}$$

and hence

$$|J(v) - J(u)| \le \|G(u + \theta_0(v - u))\|_{V'} \|v - u\|_V \le M \|v - u\|_V.$$

□

3 Convexity and Gateaux Differentiability 

A subset $U$ of a vector space $V$ is convex if whenever $u, v \in U$ the segment $\{(1 - \theta)u + \theta v,\ \theta \in [0, 1]\}$ joining $u$ and $v$ lies in $U$.

Definition 3.1. A functional $J : U \subset V \to \mathbb{R}$ on a convex set $U$ of a vector space $V$ is said to be convex if

(3.1) $$J((1 - \theta)u + \theta v) \le (1 - \theta)J(u) + \theta J(v) \text{ for all } u, v \in U \text{ and } \theta \in [0, 1].$$

$J$ is said to be strictly convex if strict inequality holds for all $u, v \in U$ with $u \neq v$ and $\theta \in\, ]0, 1[$.

We can write the inequality (3.1) in the above definition in the equivalent form

(3.1)' $$J(u + \theta(v - u)) \le J(u) + \theta(J(v) - J(u)) \text{ for all } u, v \in U \text{ and } \theta \in [0, 1].$$

The following propositions relate the convexity of functionals with the properties of their G-differentiability.

Proposition 3.1. If a functional $J : U \subset V \to \mathbb{R}$ on an open convex set is G-differentiable everywhere in $U$ in all directions then

(1) $J$ is convex if and only if

$$J(v) \ge J(u) + J'(u, v - u) \text{ for all } u, v \in U.$$

(2) $J$ is strictly convex if and only if

$$J(v) > J(u) + J'(u, v - u) \text{ for all } u, v \in U \text{ with } u \neq v.$$

Proof. (1) If $J$ is convex then we can write

$$J(v) - J(u) \ge (J(u + \theta(v - u)) - J(u))/\theta \text{ for all } \theta \in\, ]0, 1].$$

Now since $J'(u, v - u)$ exists the right side tends to $J'(u, v - u)$ as $\theta \to 0$. Thus taking limits as $\theta \to 0$ in this inequality the required inequality is obtained.

The proof of the converse assertion follows the usual proof in the case of functions. Let $u, v \in U$ and $\theta \in [0, 1]$. We have

$$J(u) \ge J(u + \theta(v - u)) + J'(u + \theta(v - u), u - (u + \theta(v - u))) = J(u + \theta(v - u)) - \theta J'(u + \theta(v - u), v - u)$$

by the homogeneity of the mapping $\varphi \mapsto J'(w, \varphi)$ and

$$J(v) \ge J(u + \theta(v - u)) + J'(u + \theta(v - u), v - (u + \theta(v - u))) = J(u + \theta(v - u)) + (1 - \theta) J'(u + \theta(v - u), v - u).$$

Multiplying the two inequalities respectively by $(1 - \theta)$ and $\theta$, and adding we obtain

$$(1 - \theta)J(u) + \theta J(v) \ge J(u + \theta(v - u)),$$

thus proving the convexity of $J$.

(2) If $J$ is strictly convex we can, first of all, write

$$J(v) - J(u) > \theta^{-1}[J(u + \theta(v - u)) - J(u)].$$

(Here we have used the inequality (3.1)'.) On the other hand, using part (1) of the proposition we have

$$J(u + \theta(v - u)) - J(u) \ge J'(u, \theta(v - u)).$$

By Remark 1.1, $J'$ is homogeneous in its second argument, i.e.

$$J'(u, \theta(v - u)) = \theta J'(u, v - u).$$

This together with the first inequality implies (2). The converse implication is proved exactly in the same way as in the first part. □

Proposition 3.2. If a functional $J : U \subset V \to \mathbb{R}$ on an open convex set of a normed space $V$ is twice G-differentiable everywhere in $U$ and in all directions and if the form $(\varphi, \psi) \mapsto J''(u; \varphi, \psi)$ is positive semi-definite, i.e. if

$$J''(u; \varphi, \varphi) \ge 0 \text{ for all } u \in U \text{ and } \varphi \in V \text{ with } \varphi \neq 0,$$

then $J$ is convex.

If the form $(\varphi, \psi) \mapsto J''(u; \varphi, \psi)$ is positive definite, i.e. if

$$J''(u; \varphi, \varphi) > 0 \text{ for all } u \in U \text{ and } \varphi \in V \text{ with } \varphi \neq 0,$$

then $J$ is strictly convex.

Proof. Since $U$ is convex the set $\{u + \theta(v - u),\ \theta \in [0, 1]\}$ is contained in $U$ whenever $u, v \in U$. Then by Taylor's formula (Proposition 2.2) we have, with $\varphi = v - u$,

$$J(v) = J(u) + J'(u, v - u) + \tfrac{1}{2} J''(u + \theta_0(v - u); v - u, v - u)$$

for some $\theta_0 \in\, ]0, 1[$. Then the positive semi-definiteness of $J''$ implies

$$J(v) \ge J(u) + J'(u, v - u)$$

from which the convexity of $J$ follows by (1) of Proposition 3.1. Similarly the strict convexity of $J$ from positive definiteness of $J''$ follows on application of (2) of Proposition 3.1. □

Now consider the functional $J : V \to \mathbb{R}$:

$$J(v) = \tfrac{1}{2} a(v, v) - L(v)$$

of Example 1.1. We have seen that $J$ is twice G-differentiable and $J''(u; \varphi, \varphi) = a(\varphi, \varphi)$. Applying Proposition 3.2 we get the

Corollary 3.1. Under the assumptions of Example 1.1, $J$ is convex (resp. strictly convex) if $a(\varphi, \psi)$ is positive semi-definite (resp. positive definite), i.e. $J$ is convex if $a(\varphi, \varphi) \ge 0$ for all $\varphi \in V$ (resp. $J$ is strictly convex if $a(\varphi, \varphi) > 0$ for all $\varphi \in V$ with $\varphi \neq 0$).

In particular, if $a(\cdot, \cdot)$ is $V$-coercive then $J$ is strictly convex.
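The convexity criterion of Proposition 3.1 can be spot-checked for the model functional in finite dimensions. The sketch below assumes $V = \mathbb{R}^2$, $a(u, v) = u^T M v$ with a symmetric positive definite matrix `M`, and $L(v) = b \cdot v$; the data `M`, `b`, the sampling range and the random seed are hypothetical.

```python
import random

M = [[2.0, 1.0], [1.0, 3.0]]   # symmetric positive definite (hypothetical data)
b = [1.0, -1.0]                # represents the linear form L (hypothetical data)

def a(u, v):
    return sum(u[i] * M[i][j] * v[j] for i in range(2) for j in range(2))

def L(v):
    return b[0] * v[0] + b[1] * v[1]

def J(v):
    return 0.5 * a(v, v) - L(v)

def dJ(u, phi):
    """G-derivative J'(u, phi) = a(u, phi) - L(phi)."""
    return a(u, phi) - L(phi)

random.seed(0)
gaps = []
for _ in range(1000):
    u = [random.uniform(-5, 5) for _ in range(2)]
    v = [random.uniform(-5, 5) for _ in range(2)]
    d = [v[0] - u[0], v[1] - u[1]]
    gaps.append(J(v) - J(u) - dJ(u, d))   # equals a(v - u, v - u) / 2

min_gap = min(gaps)
print(min_gap)
```

Every sampled gap is non-negative up to rounding, as expected since it equals $\tfrac{1}{2} a(v - u, v - u)$ by the Taylor expansion of Section 2.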
4 Gateaux Differentiability and Weak Lower Semi-Continuity

Let $V$ be a normed vector space. We use the standard notation "$v_n \rightharpoonup u$" to denote weak convergence of a sequence $v_n$ in $V$ to $u$, i.e. for any $g \in V'$ we have

$$\langle g, v_n \rangle_{V' \times V} \to \langle g, u \rangle_{V' \times V}.$$

Definition 4.1. A functional $J : V \to \mathbb{R}$ is said to be weakly lower semi-continuous if for every sequence $v_n \rightharpoonup u$ in $V$ we have

$$\liminf_{n \to \infty} J(v_n) \ge J(u).$$

Remark 4.1. The notion of weak lower semi-continuity is a local property. Definition 4.1 and the propositions below can be stated for functionals $J$ defined on an open subset $U$ of $V$ with minor changes. We shall leave these to the reader.

Proposition 4.1. If a functional $J : V \to \mathbb{R}$ is convex and admits a gradient $G(u) \in V'$ at every point $u \in V$ then $J$ is weakly lower semi-continuous.

Proof. Let $v_n$ be a sequence in $V$ such that $v_n \rightharpoonup u$ in $V$. Then $\langle G(u), v_n - u \rangle_{V' \times V} \to 0$. On the other hand, since $J$ is convex we have, by Proposition 3.1,

$$J(v_n) \ge J(u) + \langle G(u), v_n - u \rangle_{V' \times V}$$

from which on taking limits we obtain

$$\liminf_{n \to \infty} J(v_n) \ge J(u).$$

□

Proposition 4.2. If a functional $J : V \to \mathbb{R}$ is twice G-differentiable everywhere in $V$ in all directions and satisfies

(i) $J$ has a gradient $G(u) \in V'$ at all points $u \in V$,

(ii) $(\varphi, \psi) \mapsto J''(u; \varphi, \psi)$ is positive semi-definite, i.e. $J''(u; \varphi, \varphi) \ge 0$ for all $u, \varphi \in V$ with $\varphi \neq 0$,

then $J$ is weakly lower semi-continuous.

Proof. By Proposition 3.2 the condition (ii) implies that $J$ is convex. Then the assertion follows from Proposition 4.1. □

We now apply Proposition 4.2 to the functional

$$v \mapsto J(v) = \tfrac{1}{2} a(v, v) - L(v)$$

of Example 1.1. We know that it has a gradient

$$G(u) : \varphi \mapsto \langle G(u), \varphi \rangle = a(u, \varphi) - L(\varphi)$$

and $J''(u; \varphi, \varphi) = a(\varphi, \varphi)$ for all $u, \varphi \in V$.

If further we assume that $a(\cdot, \cdot)$ is $V$-coercive, i.e. there exists an $\alpha > 0$ such that

$$(J''(u; \varphi, \varphi) =)\ a(\varphi, \varphi) \ge \alpha \|\varphi\|_V^2\ (\ge 0) \text{ for all } \varphi \in V,$$

then by Proposition 4.2 we conclude that $J$ is weakly lower semi-continuous.

5 Commutation of Derivations 

We shall admit without proof the following useful result on commuta- 
tivity of the order of derivations. 

Theorem 5.1. Let $U$ be an open set in a normed vector space $V$ and $J : U \subset V \to \mathbb{R}$ be a functional on $U$. If

(i) $J''(u; \varphi, \psi)$ exists everywhere in $U$ in all directions $\varphi, \psi \in V$, and

(ii) for every pair $\varphi, \psi \in V$ the form $u \mapsto J''(u; \varphi, \psi)$ is continuous,

then we have

$$J''(u; \varphi, \psi) = J''(u; \psi, \varphi) \text{ for all } \varphi, \psi \in V.$$

For a proof we refer to 

As a consequence we deduce the

Corollary 5.1. If a functional $J : U \subset V \to \mathbb{R}$ on an open set of a normed vector space $V$ admits a Hessian $H(u) \in \mathcal{L}(V, V')$ at every point $u \in U$ and if the mapping $U \ni u \mapsto H(u) \in \mathcal{L}(V, V')$ is continuous then $H(u)$ is self-adjoint,

i.e. $$\langle H(u)\varphi, \psi \rangle_{V' \times V} = \langle H(u)\psi, \varphi \rangle_{V' \times V} \text{ for all } \varphi, \psi \in V.$$

6 Frechet Derivatives 

Let V and H be two normed vector spaces. 

Definition 6.1. A mapping $A : U \subset V \to H$ from an open set $U$ in $V$ to $H$ is said to be Frechet differentiable (or simply F-differentiable) at a point $u \in U$ if there exists a continuous linear mapping $A'(u) : V \to H$, i.e. $A'(u) \in \mathcal{L}(V, H)$, such that

(6.1) $$\lim_{\varphi \to 0} \|A(u + \varphi) - A(u) - A'(u)\varphi\|_H / \|\varphi\|_V = 0.$$

Clearly, $A'(u)$, if it exists, is unique and is called the Frechet derivative (F-derivative) of $A$ at $u$.

We can, equivalently, say that a mapping $A : U \subset V \to H$ is F-differentiable at a point $u \in U$ if there exists an element $A'(u) \in \mathcal{L}(V, H)$ such that

(6.2) $$A(u + \varphi) = A(u) + A'(u)\varphi + \|\varphi\|_V\, \varepsilon(u, \varphi),$$

where $\varepsilon(u, \varphi) \in H$ and $\varepsilon(u, \varphi) \to 0$ in $H$ as $\varphi \to 0$ in $V$.

Example 6.1. If $f$ is a function defined in an open set $U \subset \mathbb{R}^2$, i.e. $f : U \to \mathbb{R}$, then it is F-differentiable if it is once differentiable in the usual sense and

$$f'(u) = \operatorname{grad} f(u) = (\partial f/\partial x_1(u), \partial f/\partial x_2(u)) \in \mathcal{L}(\mathbb{R}^2, \mathbb{R}).$$

Example 6.2. In the case of the functional

$$v \mapsto J(v) = \tfrac{1}{2} a(v, v) - L(v)$$

of Example 1.1, where (i) and (iii) are satisfied on a Hilbert space $V$, we easily check that $J$ is F-differentiable everywhere in $V$ and its F-derivative is given by

$$\varphi \mapsto J'(u)\varphi = a(u, \varphi) - L(\varphi).$$

In fact, by (i) and (iii) of Example 1.1, $J'(u) \in V'$ since $\varphi \mapsto a(u, \varphi)$ and $\varphi \mapsto L(\varphi)$ are continuous linear, and we have

$$J(u + \varphi) - J(u) - [a(u, \varphi) - L(\varphi)] = \tfrac{1}{2} a(\varphi, \varphi) = \|\varphi\|_V\, \varepsilon(u, \varphi)$$

where $\varepsilon(u, \varphi) = \tfrac{1}{2} \|\varphi\|_V^{-1} a(\varphi, \varphi)$ and

$$|\varepsilon(u, \varphi)| \le M \|\varphi\|_V$$

so that $\varepsilon(u, \varphi) \to 0$ as $\varphi \to 0$ in $V$.

We observe that in this case the F-derivative of $J$ is the same as the gradient of $J$.

Remark 6.1. If an operator $A : U \subset V \to H$ is F-differentiable then it is also G-differentiable and its G-derivative coincides with its F-derivative. In fact, let $A$ be F-differentiable with $A'(u)$ as its F-derivative. Then, for $u \in U$, $\varphi \in V$, $\varphi \neq 0$, writing $\psi = \rho\varphi$ we have $\psi \to 0$ in $V$ as $\rho \to 0$ and

$$\rho^{-1}(A(u + \rho\varphi) - A(u)) - A'(u)\varphi = \rho^{-1}(A(u + \psi) - A(u) - A'(u)\psi) \text{ since } A'(u) \text{ is linear}$$

$$= \rho^{-1} \|\psi\|_V\, \varepsilon(u, \psi) = \rho^{-1}|\rho|\, \|\varphi\|_V\, \varepsilon(u, \psi) \to 0 \text{ in } H \text{ as } \psi \to 0 \text{ in } V, \text{ i.e. as } \rho \to 0.$$

Remark 6.2. However, in general, the converse is not true. Example 1.2 shows that the function $f$ has a G-derivative but is not F-differentiable. We also note that the G-derivative need not be a linear map of $V$ into $H$ (as in Example 1.2) while the F-derivative is necessarily linear by definition and belongs to $\mathcal{L}(V, H)$.

Remark 6.3. The notions of F-differentiability of higher orders and the corresponding F-derivatives can be defined in an obvious manner. Since, whenever we have F-differentiability we also have G-differentiability, Taylor's formula and hence all its consequences remain valid under the assumption of F-differentiability. We shall not therefore mention these facts again.

7 Model Problem 

We shall collect here all the results we have obtained for the case of the functional

$$v \mapsto J(v) = \tfrac{1}{2} a(v, v) - L(v)$$

on a Hilbert space $V$ satisfying conditions (i), (ii) and (iii) of Example 1.1. This contains, as the abstract formulation, most of the linear elliptic problems that we shall consider except for the case of non-symmetric elliptic operators.

(1) $J$ is twice Frechet differentiable (in fact, F-differentiable of all orders) and hence is also Gateaux differentiable.

$$J'(u, \varphi) = a(u, \varphi) - L(\varphi) \text{ and } J''(u; \varphi, \psi) = a(\varphi, \psi).$$

$J$ has a gradient and a Hessian at every point $u \in V$:

$$G(u) = (\operatorname{grad} J)(u) : \varphi \mapsto a(u, \varphi) - L(\varphi).$$

Moreover, $H(u)$ is self-adjoint since $a(\varphi, \psi) = a(\psi, \varphi)$ for all $\varphi, \psi \in V$.

(2) Taylor's formula for $J$. If $u, v \in V$ then

$$J(v) = J(u) + \{a(u, v - u) - L(v - u)\} + \tfrac{1}{2} a(v - u, v - u).$$

(3) Since the mapping $v \mapsto a(u, v)$ for any $u \in V$ is continuous linear and $L \in V'$, by the theorem of Frechet-Riesz on Hilbert spaces there exist (unique) elements $Au, f \in V$ such that

$$a(u, v) = (Au, v)_V \text{ and } L(v) = (f, v)_V \text{ for all } v \in V.$$

Clearly $A : V \to V$ is a continuous linear map. Moreover we have

$$\|A\|_{\mathcal{L}(V, V)} \le M \text{ by (i)},$$

$$(Av, v)_V \ge \alpha \|v\|_V^2 \text{ for all } v \in V \text{ by (ii), and}$$

$$\|f\|_V \le N.$$

(4) The functional $J$ is strictly convex in $V$.

(5) $J$ is weakly lower semi-continuous in $V$.

Chapter 2

Minimisation of Functionals - Theory

In this chapter we shall discuss the local and global minima of functionals on Banach spaces and give some sufficient conditions for their existence, and relate them to conditions on their G-derivatives (when they exist) and to convexity properties. Then we shall show that the problem of minimisation applied to suitable functionals on Sobolev spaces leads to, and is equivalent to, some of the standard examples of linear and non-linear elliptic boundary value problems.

1 Minimisation Without Convexity 

Let $U$ be a subset of a normed vector space $V$ and $J : U \subset V \to \mathbb{R}$ be a functional.

Definition 1.1. A functional $J : U \subset V \to \mathbb{R}$ is said to have a local minimum at a point $u \in U$ if there exists a neighbourhood $\mathcal{V}(u)$ of $u$ in $V$ such that

$$J(u) \le J(v) \text{ for all } v \in U \cap \mathcal{V}(u).$$

Definition 1.2. A functional $J$ on $U$ is said to have a global minimum (or an absolute minimum) in $U$ if there exists a $u \in U$ such that

$$J(u) \le J(v) \text{ for all } v \in U.$$

We have the following existence result.

Theorem 1.1. Suppose $V$, $U$ and $J : U \to \mathbb{R}$ satisfy the following hypotheses:

(H1) $V$ is a reflexive Banach space,
(H2) $U$ is weakly closed,
(H3) $U$ is bounded, and
(H4) $J : U \subset V \to \mathbb{R}$ is weakly lower semi-continuous.

Then $J$ has a global minimum in $U$.

Proof. Let $\ell$ denote $\inf_{v \in U} J(v)$. If $v_n$ is a minimising sequence for $J$, i.e. $\ell = \inf_{v \in U} J(v) = \lim_{n \to \infty} J(v_n)$, then by the boundedness of $U$ (i.e. by (H3)) $v_n$ is a bounded sequence in $V$, i.e. there exists a constant $C > 0$ such that $\|v_n\|_V \le C$ for all $n$. By the reflexivity of $V$ (H1) this bounded sequence is weakly relatively compact. So there is a subsequence $v_{n'}$ of $v_n$ such that $v_{n'} \rightharpoonup u$ in $V$. $U$ being weakly closed (H2), $u \in U$. Finally, since $v_{n'} \rightharpoonup u$ and $J$ is weakly lower semi-continuous,

$$J(u) \le \liminf_{n' \to \infty} J(v_{n'})$$

which implies that

$$J(u) \le \lim_{n' \to \infty} J(v_{n'}) = \ell \le J(v) \text{ for all } v \in U.$$

□

Theorem 1.2. If $V$, $U$ and $J$ satisfy the hypotheses (H1), (H2), (H4) and $J$ satisfies

(H3)' $$\lim_{\|v\|_V \to +\infty} J(v) = +\infty$$

then $J$ admits a global minimum in $U$.

Proof. We shall reduce the problem to the previous case. Let $w \in U$ be arbitrarily fixed. Consider the subset $U_0$ of $U$:

$$U_0 = \{v;\ v \in U \text{ such that } J(v) \le J(w)\}.$$

It is immediately seen that the existence of a minimum in $U_0$ is equivalent to that in $U$. We claim that $U_0$ is bounded and weakly closed in $V$, i.e. hypotheses (H2) and (H3) hold for $U_0$. In fact, suppose $U_0$ is not bounded; then we can find a sequence $v_n \in U_0$ with $\|v_n\|_V \to +\infty$. Then, by (H3)', $J(v_n) \to +\infty$, which is impossible since $v_n \in U_0$ implies that $J(v_n) \le J(w)$. Hence $U_0$ is bounded. To prove that $U_0$ is weakly closed, let $u_n \in U_0$ be a sequence such that $u_n \rightharpoonup u$ in $V$. Since $U$ is weakly closed, $u \in U$. On the other hand, since $J$ is weakly lower semi-continuous, $u_n \rightharpoonup u$ in $V$ implies that

$$J(u) \le \liminf J(u_n) \le J(w)$$

proving that $u \in U_0$. Now $U_0$ and $J$ satisfy all the hypotheses of Theorem 1.1 and hence $J$ has a global minimum in $U_0$ and hence in $U$. □

Next we give a necessary condition for the existence of a local minimum in terms of the first G-derivative (when it exists) of the functional $J$. For this we need the following concept of admissible (or feasible) directions at a point $u$ for a domain $U$ in $V$. If $u, v \in V$, $u \neq v$, then the nonzero vector $v - u$ can be considered as a direction in $V$.

Definition 1.3. (1) A direction $v - u$ in $V$ is said to be a strongly admissible direction at the point $u$ for the domain $U$ if there exists a sequence $\varepsilon_n > 0$ such that

$$\varepsilon_n \to 0 \text{ as } n \to \infty \text{ and } u + \varepsilon_n(v - u) \in U \text{ for each } n.$$

(2) A direction $v - u$ in $V$ is said to be weakly admissible at the point $u$ for the domain $U$ if there exist sequences $\varepsilon_n > 0$ and $w_n \in V$ such that

$$\varepsilon_n \to 0 \text{ and } w_n \to 0 \text{ in } V, \quad u + \varepsilon_n(v - u) + \varepsilon_n w_n \in U \text{ for each } n.$$

We shall mainly use the notion of strongly admissible direction. But some results on minimisation of functionals are known which make use of the notion of weakly admissible directions.

We have the following necessary condition for the existence of a local minimum.

Theorem 1.3. Suppose a functional $J : U \subset V \to \mathbb{R}$ has a local minimum at a point $u \in U$ and is G-differentiable at $u$ in all directions; then $J'(u, v - u) \ge 0$ for every $v \in V$ such that $v - u$ is a strongly admissible direction.

Furthermore, if $U$ is an open set then

$$J'(u, \varphi) = 0 \text{ for all } \varphi \in V.$$

Proof. If $u \in U$ is a local minimum for $J$ then there exists a neighbourhood $\mathcal{V}(u)$ of $u$ in $V$ such that

$$J(u) \le J(w) \text{ for all } w \in U \cap \mathcal{V}(u).$$

If $v \in V$ and $v - u$ is a strongly admissible direction then, for $n$ large enough,

$$u + \varepsilon_n(v - u) \in U \cap \mathcal{V}(u)$$

so that

$$J(u) \le J(u + \varepsilon_n(v - u)).$$

Hence

$$J'(u, v - u) = \lim_{n \to \infty} (J(u + \varepsilon_n(v - u)) - J(u))/\varepsilon_n \ge 0.$$

Finally, if $U$ is an open set in $V$ then $U$ contains an open ball in $V$ of centre $u$ and hence every direction is strongly admissible at $u$ for $U$. Taking $v - u = \pm\varphi$, $\varphi \in V$, it follows from the first part that

$$J'(u, \pm\varphi) \ge 0 \text{ or equivalently } J'(u, \varphi) = 0 \text{ for all } \varphi \in V.$$

□

In particular, if $U$ is open, $J$ has a gradient $G(u)$ at $u$, and $u$ is a local minimum, then

$$J'(u, \varphi) = \langle G(u), \varphi \rangle_{V' \times V} = 0 \text{ for all } \varphi \in V; \text{ i.e. } G(u) = 0 \in V'.$$

This result is thus in conformity with the classical case of differentiable functions.
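For the model functional $J(v) = \tfrac{1}{2} a(v, v) - L(v)$ of Chapter 1 the condition $G(u) = 0$ reads $a(u, \varphi) = L(\varphi)$ for all $\varphi$. A finite-dimensional sketch follows, assuming $V = \mathbb{R}^2$, $a(u, v) = u^T M v$ and $L(v) = b \cdot v$; the data `M`, `b` and the perturbations are hypothetical and serve only to illustrate the statement.

```python
M = [[2.0, 1.0], [1.0, 3.0]]   # symmetric positive definite (hypothetical data)
b = [1.0, -1.0]                # represents the linear form L (hypothetical data)

def J(v):
    """J(v) = a(v, v)/2 - L(v) with a(u, v) = u^T M v and L(v) = b . v."""
    quad = sum(v[i] * M[i][j] * v[j] for i in range(2) for j in range(2))
    return 0.5 * quad - (b[0] * v[0] + b[1] * v[1])

def gradient(u):
    """G(u) = M u - b, so that J'(u, phi) = G(u) . phi."""
    return [M[0][0] * u[0] + M[0][1] * u[1] - b[0],
            M[1][0] * u[0] + M[1][1] * u[1] - b[1]]

# Solve M u = b by Cramer's rule: the point where the gradient vanishes.
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
u_star = [(b[0] * M[1][1] - M[0][1] * b[1]) / det,
          (M[0][0] * b[1] - M[1][0] * b[0]) / det]

G = gradient(u_star)
perturbations = [(0.1, 0.0), (0.0, -0.1), (0.3, 0.2), (-1.0, 1.0)]
increases = [J([u_star[0] + p, u_star[1] + q]) - J(u_star) for p, q in perturbations]
print(u_star, G, increases)
```

The gradient vanishes at `u_star` and every sampled perturbation increases `J`, which is consistent with the necessary condition above; that the condition is also sufficient here relies on convexity, treated in the next section.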

Remark 1.1. The converse of Theorem 1.3 requires convexity assumptions, as we shall see in the following section.

2 Minimisation with Convexity Conditions

We shall show that under convexity assumptions on the domain $U$ and the functional $J$ the notions of local and global minima coincide. We also give another sufficient condition for the existence of minima.

Lemma 2.1. If $U$ is a convex subset of a normed vector space $V$ and $J : U \subset V \to \mathbb{R}$ is a convex functional then any local minimum is also a global minimum.

Proof. Suppose $u \in U$ is a local minimum of $J$. Then there is a neighbourhood $\mathcal{V}(u)$ of $u$ in $V$ such that

$$J(u) \le J(v) \text{ for all } v \in \mathcal{V}(u) \cap U.$$

On the other hand, if $v \in U$ then $u + \theta(v - u) \in U$ for all $\theta \in [0, 1]$ by convexity of $U$. Moreover, if $\theta$ is small enough, say $0 < \theta < \theta_v$, then $u + \theta(v - u) \in \mathcal{V}(u)$. Hence

$$J(u) \le J(u + \theta(v - u)) \le J(u) + \theta(J(v) - J(u)) \text{ for all } 0 < \theta < \theta_v,$$

the second inequality holding by convexity of $J$. This implies that

$$J(u) \le J(v) \text{ for all } v \in U.$$

□

Whenever the assumptions of Lemma 2.1 are satisfied we shall speak of a minimum without reference to local or global. The next lemma concerns the uniqueness of such a minimum.

Lemma 2.2. If $U$ is a convex subset of a normed vector space and $J : U \subset V \to \mathbb{R}$ is strictly convex then there exists a unique minimum $u \in U$ for $J$.

Proof. The existence is proved in Lemma 2.1. To prove the uniqueness, if $u_1 \neq u_2$ are two minima for $J$ in $U$ then

$$J(u_1) = J(u_2) \le J(v) \text{ for all } v \in U$$

and, in particular, this holds for $v = \tfrac{1}{2} u_1 + \tfrac{1}{2} u_2$, which belongs to $U$ since $U$ is convex. On the other hand, since $J$ is strictly convex,

$$J(\tfrac{1}{2} u_1 + \tfrac{1}{2} u_2) < \tfrac{1}{2} J(u_1) + \tfrac{1}{2} J(u_2) = J(u_1)\ (\le J(v))$$

which is impossible if we take $v = \tfrac{1}{2}(u_1 + u_2)$. This proves the uniqueness of the minimum. □

We shall now pass to a sufficient condition for the existence of min- 
ima of functionals which is the exact analogue of the case of twice dif- 
ferentiable functions. 

Theorem 2.1. Let $J : V \to \mathbb{R}$ be a functional on $V$ and $U$ a subset of $V$
satisfying the following hypotheses:

(H1) $V$ is a reflexive Banach space;

(H2) $J$ has a gradient $G(u) \in V'$ everywhere in $U$;

(H3) $J$ is twice G-differentiable in all directions $\varphi, \psi \in V$ and satisfies
the condition

$J''(u; \varphi, \varphi) \ge \|\varphi\|_V \, \chi(\|\varphi\|_V)$ for all $\varphi \in V$,

where $t \mapsto \chi(t)$ is a function on $\{t \in \mathbb{R};\ t \ge 0\}$ such that

$\chi(t) \ge 0$ and $\lim_{t \to +\infty} \chi(t) = +\infty$;

(H4) $U$ is a closed convex set.

Then there exists at least one minimum $u \in U$ of $J$. Furthermore,
if in (H3) the condition

(H5) $\chi(t) > 0$ for $t > 0$

is satisfied by $\chi$ then there exists a unique minimum of $J$ in $U$.

Remark 2.1. We note that a convex set $U$ is weakly closed if and only if it
is strongly closed and thus in (H4) above $U$ may be assumed weakly
closed.

Proof of Theorem 2.1. First of all by (H3), $J''(u; \varphi, \varphi) \ge 0$ and hence $J$
is convex by Proposition 3.2 of Chapter 1. Similarly (H5) implies that $J$ is strictly
convex, again by Proposition 3.2 of Chapter 1. Then, by Proposition 4.2 of Chapter 1, (H2)
and (H3) together imply that $J$ is weakly lower semi-continuous. We
next show that $J$ satisfies condition (H3)' of Theorem 1.2, namely
$J(v) \to +\infty$ as $\|v\|_V \to +\infty$. For this let $w \in U$ be arbitrarily fixed.
Then, because of (H2) and (H3), we can apply Taylor's formula to get,
for $v \in V$,

$J(v) = J(w) + \langle G(w), v - w \rangle_{V' \times V} + \frac{1}{2} J''(w + \theta_0(v - w); v - w, v - w)$

for some $\theta_0 \in \,]0, 1[$. Using (H3) and estimating the second and third
terms on the right side we have

$|\langle G(w), v - w \rangle_{V' \times V}| \le \|G(w)\|_{V'} \|v - w\|_V$,

$J''(w + \theta_0(v - w); v - w, v - w) \ge \|v - w\|_V \, \chi(\|v - w\|_V)$,

and hence

$J(v) \ge J(w) + \|v - w\|_V \left[ \frac{1}{2}\chi(\|v - w\|_V) - \|G(w)\|_{V'} \right]$.

Here, since $w \in U$ is fixed, as $\|v\|_V \to +\infty$ we have

$\|v - w\|_V \to +\infty$,

$J(w)$ and $\|G(w)\|_{V'}$ are constants and

$\chi(\|v - w\|_V) \to +\infty$ by (H3),

which implies that $J(v) \to +\infty$ as $\|v\|_V \to +\infty$. The theorem then
follows on application of Theorem 1.2. □

Theorem 2.2. Suppose $U$ is a convex subset of a Banach space $V$ and
$J : U \subset V \to \mathbb{R}$ is a G-differentiable (in all directions) convex functional.
Then $u \in U$ is a minimum for $J$ (i.e. $J(u) \le J(v)$ for all $v \in U$) if and only if

$u \in U$ and $J'(u, v - u) \ge 0$ for all $v \in U$.

Proof. Let $u \in U$ be a minimum for $J$. Then, since $U$ is convex, $v - u$ is
a strongly admissible direction at $u$ for $U$ for any $v \in U$. Then, by Theorem
1.3, $J'(u, v - u) \ge 0$ for any $v \in U$. Conversely, since $J$ is convex and
G-differentiable, by part (1) of Proposition 3.1 of Chapter 1 we find that

$J(v) \ge J(u) + J'(u, v - u)$ for any $v \in U$.

Then using the assumption that $J'(u, v - u) \ge 0$ it follows that $J(u) \le J(v)$,
i.e. $u$ is a minimum for $J$ in $U$. □
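
The characterization in Theorem 2.2 can be checked numerically in a finite-dimensional setting. The following is only an illustrative sketch under assumptions that are not in the text: $V = \mathbb{R}^3$, $U$ is the unit box $[0,1]^3$, and $J(v) = \frac{1}{2}|v - c|^2$ for a hypothetical point `c`, whose minimum over the box is obtained by clipping `c`. The names `grad_J`, `clip_to_box` and `directional_derivative` are illustrative helpers.

```python
import random

def grad_J(u, c):
    """Gradient G(u) = u - c of the test functional J(v) = 0.5 * |v - c|^2."""
    return [ui - ci for ui, ci in zip(u, c)]

def clip_to_box(c, lo=0.0, hi=1.0):
    """Minimum of J over the box [lo, hi]^n (componentwise clipping of c)."""
    return [min(max(ci, lo), hi) for ci in c]

def directional_derivative(u, v, c):
    """J'(u, v - u) = (G(u), v - u)."""
    g = grad_J(u, c)
    return sum(gi * (vi - ui) for gi, vi, ui in zip(g, v, u))

c = [1.5, -0.3, 0.4]          # hypothetical data, partly outside the box
u = clip_to_box(c)            # candidate minimum in U

random.seed(0)
samples = [[random.random() for _ in c] for _ in range(1000)]
# Smallest value of J'(u, v - u) over random points v of U.
min_dd = min(directional_derivative(u, v, c) for v in samples)

# A point of U that is not the minimum violates the inequality for some v.
not_min = [0.5, 0.5, 0.5]
dd_violation = directional_derivative(not_min, [1.0, 0.0, 0.5], c)
```

For the minimum `u` the quantity `min_dd` is non-negative, while for the non-minimal point `not_min` the sampled direction gives a negative derivative, in agreement with the "if and only if" of the theorem.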

Our next result concerns minima of convex functionals in the
non-differentiable case.

Theorem 2.3. Let $U$ be a convex subset of a Banach space $V$. Suppose
$J : U \subset V \to \mathbb{R}$ is a functional of the form $J = J_1 + J_2$ where $J_1, J_2$
are convex functionals and $J_2$ is G-differentiable in $U$ in all directions.
Then $u \in U$ is a minimum for $J$ if and only if

$u \in U$, $J_1(v) - J_1(u) + J_2'(u, v - u) \ge 0$ for all $v \in U$.

Proof. Suppose $u \in U$ is a minimum of $J$. Then, for any $v \in U$ and $\theta \in \,]0, 1]$,

$J(u) = J_1(u) + J_2(u) \le J_1(u + \theta(v - u)) + J_2(u + \theta(v - u))$

since $u + \theta(v - u) \in U$. Here, by convexity of $J_1$, we have

$J_1(u + \theta(v - u)) \le J_1(u) + \theta(J_1(v) - J_1(u))$

so that

$J_2(u) \le \theta(J_1(v) - J_1(u)) + J_2(u + \theta(v - u))$.

That is,

$J_1(v) - J_1(u) + (J_2(u + \theta(v - u)) - J_2(u))/\theta \ge 0$.

Taking limits as $\theta \to 0$ we get the required assertion. Conversely,
since $J_2$ is convex and G-differentiable we have, from part (1) of
Proposition 3.1 of Chapter 1,

$J_2(v) - J_2(u) \ge J_2'(u, v - u)$ for all $u, v \in U$.

Now we can write, for any $v \in U$,

$J(v) - J(u) = J_1(v) - J_1(u) + J_2(v) - J_2(u) \ge J_1(v) - J_1(u) + J_2'(u, v - u) \ge 0$

by assumption, which proves that $u \in U$ is a minimum for $J$. □

3 Applications to the Model Problem and Reduction to Variational Inequality

We shall apply the results of Section 2 to the functional $J$ of the model
example on a Hilbert space. More precisely, let $V$ be a Hilbert space and
$J : V \to \mathbb{R}$ be the functional

$v \mapsto J(v) = \frac{1}{2} a(v, v) - L(v)$

where $a(\cdot, \cdot)$ is a symmetric bilinear, bicontinuous, coercive form on $V$
and $L \in V'$. Further, let $K$ be a closed convex subset of $V$. Consider the
following

Problem 3.1. To find

$u \in K$; $J(u) \le J(v)$ for all $v \in K$,

i.e. to find a $u \in K$ which minimizes $J$ on $K$. We have seen in Chapter 1
that $J$ is twice F- (and hence also G-) differentiable and that

$J'(u, \varphi) = \langle G(u), \varphi \rangle_{V' \times V} = a(u, \varphi) - L(\varphi)$,

$J''(u; \varphi, \psi) = \langle H(u)\varphi, \psi \rangle_{V' \times V} = a(\varphi, \psi)$.

Moreover, the coercivity of $a(\cdot, \cdot)$ implies that

$J''(u; \varphi, \varphi) = a(\varphi, \varphi) \ge \alpha \|\varphi\|_V^2$.

If we choose $\chi(t) = \alpha t$ then all the assumptions of Theorem 2.1 are
satisfied by $V$, $J$ and $K$ so that Problem 3.1 has a unique solution.
Also, by Theorem 2.2, Problem 3.1 is equivalent to

Problem 3.2. To find

$u \in K$; $a(u, v - u) \ge L(v - u)$ for all $v \in K$.

We can summarise these facts as 

Theorem 3.1. (1) There exists a unique solution $u \in K$ of Problem 3.1
and

(2) Problem 3.1 is equivalent to Problem 3.2.

Problem 3.2 is called a variational inequality associated to the
closed convex set $K$ and the bilinear form $a(\cdot, \cdot)$. As we shall see in
the following section the variational inequality (3.2) arises as a
generalization of elliptic boundary value problems for suitable elliptic
operators. It turns out that in many problems solving (numerically)
the minimisation problem 3.1 is much easier and faster than solving the
equivalent variational inequality (3.2).

In the particular case where $K = V$, Problem 3.1 is nothing but
the Problem

(3.3) to find $u \in V$; $J(u) \le J(v)$ for all $v \in V$

which is equivalent to the Problem

(3.4) to find $u \in V$; $a(u, \varphi) = L(\varphi)$ for all $\varphi \in V$.

As we have seen earlier, (3.4) is equivalent to (3.2): if $\varphi \in V$ we take
$v = u \pm \varphi \in K = V$ in (3.2) to get (3.4), and the converse is trivial.

The following result is a generalization of Theorem 3.1 to the
non-symmetric case and is due to G. Stampacchia. It generalizes and
includes the classical Lax-Milgram theorem (see [43]).

Theorem 3.2. (Stampacchia). Let $K$ be a closed convex subset of a
Hilbert space $V$ and $a(\cdot, \cdot)$ be a bilinear bicontinuous coercive form on
$V$. Then for any given $L \in V'$ the variational inequality (3.2) has a unique
solution $u \in K$.

Proof. Since, for any $u \in V$, $v \mapsto a(u, v)$ is continuous linear on $V$ and $L \in V'$,
there exist unique elements $Au, f \in V$ by the Fréchet-Riesz theorem such that

$a(u, v) = (Au, v)_V$ and $L(v) = (f, v)_V$.

Moreover $A \in \mathcal{L}(V, V)$ with $\|A\|_{\mathcal{L}(V,V)} \le M$ and $\|f\|_V \le N$ where
$M > 0$, $N > 0$ are constants such that

$|a(u, v)| \le M \|u\|_V \|v\|_V$ for all $u, v \in V$,
$|L(v)| \le N \|v\|_V$ for all $v \in V$.

Let $\alpha > 0$ be the constant of $V$-coercivity of $a(\cdot, \cdot)$, i.e.

$a(v, v) \ge \alpha \|v\|_V^2$ for all $v \in V$.

Since $K$ is a closed convex set there exists a projection mapping
$P : V \to K$, which is non-expansive: $\|Pv_1 - Pv_2\|_V \le \|v_1 - v_2\|_V$ for all
$v_1, v_2 \in V$. Let $\gamma > 0$ be a constant which we shall
choose suitably later on. Consider the mapping

$V \ni v \mapsto v - \gamma(Av - f) = T_\gamma(v) \in V$.

For $\gamma$ sufficiently small $T_\gamma$ is a contraction mapping. In fact, if
$v_1, v_2 \in V$ then

$T_\gamma v_1 - T_\gamma v_2 = (I - \gamma A)(v_1 - v_2)$.

Setting $w = v_1 - v_2$ we have

$\|(I - \gamma A)w\|_V^2 = (w - \gamma Aw, w - \gamma Aw)_V$
$= \|w\|_V^2 - \gamma[(w, Aw)_V + (Aw, w)_V] + \gamma^2 \|Aw\|_V^2$
$\le \|w\|_V^2 - 2\gamma\alpha \|w\|_V^2 + \gamma^2 M^2 \|w\|_V^2$
$= (1 - 2\gamma\alpha + \gamma^2 M^2) \|w\|_V^2$

by $V$-coercivity and continuity of the operator $A$. It is easy to see that
if $0 < \gamma < 2\alpha / M^2$ then $1 - 2\gamma\alpha + \gamma^2 M^2 < 1$ and hence $T_\gamma$ becomes
a contraction mapping. Then the mapping $P T_\gamma|_K : K \to K$ is a
contraction mapping and hence has a unique fixed point $u \in K$ by the contraction
mapping theorem, i.e.

$u \in K$ and $u = P(u - \gamma(Au - f))$.

This is the required solution of the variational inequality (3.2), as can
easily be checked: the projection $u = Pz$ of $z = u - \gamma(Au - f)$ onto $K$ is
characterized by $(u - z, v - u)_V \ge 0$ for all $v \in K$, that is
$\gamma (Au - f, v - u)_V \ge 0$ for all $v \in K$. □
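
The fixed point construction in the proof above is itself an iterative procedure. The following sketch illustrates it under assumptions that are not part of the text: $V = \mathbb{R}^2$ with the Euclidean inner product, $K$ the non-negative quadrant (so that $P$ is componentwise truncation at zero), and a hypothetical non-symmetric matrix `A` with $(Av, v) = 2|v|^2$ and $\|A\|^2 = 5$, so that any $0 < \gamma < 2\alpha/M^2 = 0.8$ is admissible. The names `solve_vi`, `project_nonneg` and `matvec` are illustrative.

```python
def matvec(A, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def project_nonneg(x):
    """Projection P onto K = {v : v_i >= 0 for all i}."""
    return [max(xi, 0.0) for xi in x]

def solve_vi(A, f, gamma, tol=1e-12, max_iter=10000):
    """Iterate u <- P(u - gamma * (A u - f)) until successive iterates agree."""
    u = [0.0] * len(f)
    for _ in range(max_iter):
        Au = matvec(A, u)
        new = project_nonneg(
            [ui - gamma * (aui - fi) for ui, aui, fi in zip(u, Au, f)]
        )
        if max(abs(a - b) for a, b in zip(new, u)) < tol:
            return new
        u = new
    return u

# Hypothetical data: A is non-symmetric, coercive with alpha = 2, and M^2 = 5.
A = [[2.0, 1.0], [-1.0, 2.0]]
f = [1.0, -3.0]
u = solve_vi(A, f, gamma=0.4)

def vi_value(u, v):
    """(A u - f, v - u), which must be >= 0 for every v in K."""
    r = [aui - fi for aui, fi in zip(matvec(A, u), f)]
    return sum(ri * (vi - ui) for ri, vi, ui in zip(r, v, u))

# Check the variational inequality on a grid of points of K.
grid = [[0.25 * i, 0.25 * j] for i in range(9) for j in range(9)]
worst = min(vi_value(u, v) for v in grid)
```

For these data the iteration converges to $u = (0.5, 0)$, and `worst` is non-negative up to rounding, which is the discrete form of (3.2).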

4 Some Functional Spaces 

We shall briefly recall some important Sobolev spaces of distributions
on an open set in $\mathbb{R}^n$ and some of their properties. These spaces play
an important role in the weak (or variational) formulation of elliptic
problems which we shall consider in the following. All our functionals
in the examples will be defined on these spaces. For details we refer to
the book of Lions and Magenes [32].

Let $\Omega$ be a bounded open subset in $\mathbb{R}^n$ and $\Gamma$ denote its boundary. We
shall assume $\Gamma$ to be sufficiently "regular", which we shall make precise
whenever necessary.

Sobolev spaces. We introduce the Sobolev space $H^1(\Omega)$:

(4.1) $H^1(\Omega) = \{v \mid v \in L^2(\Omega),\ D_j v \in L^2(\Omega),\ j = 1, \dots, n\}$

where $D_j v = \partial v / \partial x_j$ are taken in the sense of distributions, i.e.

$\langle D_j v, \varphi \rangle = -\langle v, D_j \varphi \rangle$ for all $\varphi \in \mathcal{D}(\Omega)$.

Here $\mathcal{D}(\Omega)$ denotes the space of all $C^\infty$-functions with compact
support in $\Omega$ and $\langle \cdot, \cdot \rangle$ denotes the duality between $\mathcal{D}(\Omega)$ and the
space of distributions $\mathcal{D}'(\Omega)$ on $\Omega$. $H^1(\Omega)$ is provided with the
inner product

(4.2) $((u, v)) = (u, v)_{L^2(\Omega)} + \sum_{j=1}^{n} (D_j u, D_j v)_{L^2(\Omega)}$

for which it becomes a Hilbert space. The following inclusions are
obvious (and are continuous): $\mathcal{D}(\Omega) \subset C^1(\overline{\Omega}) \subset H^1(\Omega)$.
We also introduce the space

(4.3) $H^1_0(\Omega)$ = the closure of $\mathcal{D}(\Omega)$ in $H^1(\Omega)$.

We have the following well-known results.

(4.4) Theorem of Density: If $\Gamma$ is "regular" (for instance, $\Gamma$ is a $C^1$ (or
$C^\infty$)-manifold of dimension $n - 1$) then $C^1(\overline{\Omega})$ (resp. $C^\infty(\overline{\Omega})$) is dense in
$H^1(\Omega)$.

(4.5) Theorem of Trace. If $\Gamma$ is "regular" then the linear mapping
$v \mapsto v|_\Gamma$ of $C^1(\overline{\Omega}) \to C^1(\Gamma)$ (resp. of $C^\infty(\overline{\Omega}) \to C^\infty(\Gamma)$) extends to a
continuous linear map of $H^1(\Omega)$ into $L^2(\Gamma)$ denoted by $\gamma$, and for any $v \in H^1(\Omega)$,
$\gamma v$ is called the trace of $v$ on $\Gamma$. Moreover, $H^1_0(\Omega) = \{v \in H^1(\Omega);\ \gamma v = 0\}$.
We shall more often use this characterization of $H^1_0(\Omega)$. The trace map is
not surjective. For a characterization of the image of $H^1(\Omega)$ by $\gamma$ (which
is a proper subspace, denoted by $H^{1/2}(\Gamma)$) we refer to the book of Lions
and Magenes [32]. We can also define spaces $H^m(\Omega)$ and $H^m_0(\Omega)$ in the
same way for any $m > 1$.

Remark 4.1. The Theorem of Trace is slightly more precise than our
statement above. For this and also for a proof we refer to the book of
Lions and Magenes [32].

For some non-linear problems we shall also need spaces of the form

(4.6) $V = H^1_0(\Omega) \cap L^p(\Omega)$ where $p \ge 2$.

The space $V$ is provided with the norm

$v \mapsto \|v\|_V = \|v\|_{H^1(\Omega)} + \|v\|_{L^p(\Omega)}$

for which it becomes a Banach space. If $2 \le p < +\infty$ then $V$ is a
reflexive Banach space.

In order to give an interpretation of the solutions of weak formulations
of the problems as solutions of certain differential equations with
boundary conditions we shall need an extension of the classical Green's
formula which we recall here.

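(4.8) Green's formula for Sobolev spaces. Let $\Omega$ be a bounded open
set with sufficiently "regular" boundary $\Gamma$. Then there exists a unique
outer normal vector $n(x)$ at each point $x$ on $\Gamma$. Let $(n_1(x), \dots, n_n(x))$
denote the direction cosines of $n(x)$. We define the operator of exterior
normal derivation formally as

$\partial/\partial n = \sum_{j=1}^{n} n_j(x) D_j$.

Now if $u, v \in C^1(\overline{\Omega})$ then by the classical Green's formula we have

$\int_\Omega (D_j u) v \, dx = -\int_\Omega u (D_j v) \, dx + \int_\Gamma u v \, n_j \, d\sigma$

where $d\sigma$ is the area element on $\Gamma$. This formula remains valid also if
$u, v \in H^1(\Omega)$ in view of the trace theorem and density theorem, as can be
seen using convergence theorems.

Next if $u, v \in C^2(\overline{\Omega})$, then applying the above formula to $D_j u$ and $v$ and
summing over $j = 1, \dots, n$ we get

(4.9) $\sum_{j=1}^{n} \int_\Omega (D_j u)(D_j v) \, dx = -\sum_{j=1}^{n} \int_\Omega (D_j^2 u) v \, dx + \sum_{j=1}^{n} \int_\Gamma (D_j u) n_j v \, d\sigma$

(4.10) i.e. $\sum_{j=1}^{n} \int_\Omega (D_j u)(D_j v) \, dx = -\int_\Omega (\Delta u) v \, dx + \int_\Gamma \frac{\partial u}{\partial n} v \, d\sigma$.

Once again this formula remains valid if, for instance, $u \in H^2(\Omega)$ and $v \in H^1(\Omega)$,
using the density and trace theorems. In fact, $u \in H^2(\Omega)$ implies
that $\Delta u \in L^2(\Omega)$ and since $D_j u \in H^1(\Omega)$, $\gamma(D_j u)$ exists and belongs to $L^2(\Gamma)$
so that $\partial u/\partial n = \sum_{j=1}^{n} n_j \gamma(D_j u) \in L^2(\Gamma)$.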

5 Examples 

In this section we shall apply results of the previous sections to some
concrete examples of functionals on Sobolev spaces and we interpret
the corresponding variational inequalities as boundary value problems
for differential operators.

Throughout this section $\Omega$ will be a bounded open set with sufficiently
"regular" boundary $\Gamma$. We shall not make precise the exact regularity
conditions on $\Gamma$ except to say that it is such that the trace, density
and Green's formula are valid.

We begin with the following abstract linear problem. 

Example 5.1. Let $\Gamma = \Gamma_1 \cup \Gamma_2$ where $\Gamma_j$ are open subsets of $\Gamma$ such that
$\Gamma_1 \cap \Gamma_2 = \emptyset$. Consider the space

(5.1) $V = \{v \mid v \in H^1(\Omega);\ \gamma v = 0 \text{ on } \Gamma_1\}$.

$V$ is clearly a closed subspace of $H^1(\Omega)$ and is provided with the
inner product induced from that in $H^1(\Omega)$ and hence it is a Hilbert space.
Moreover,

(5.2) $H^1_0(\Omega) \subset V \subset H^1(\Omega)$

and the inclusions are continuous linear. If $f \in L^2(\Omega)$ we consider the
functional

(5.3) $J(v) = \frac{1}{2}((v, v)) - (f, v)_{L^2(\Omega)}$

i.e. $a(u, v) = ((u, v))$ and $L(v) = (f, v)_{L^2(\Omega)}$. Then $a(\cdot, \cdot)$ is bilinear,
bicontinuous and $V$-coercive:

$|a(u, v)| \le \|u\|_V \|v\|_V = \|u\|_{H^1(\Omega)} \|v\|_{H^1(\Omega)}$ for $u, v \in V$,

$a(v, v) = \|v\|_{H^1(\Omega)}^2$ for $v \in V$,

and $|L(v)| \le \|f\|_{L^2(\Omega)} \|v\|_{L^2(\Omega)} \le \|f\|_{L^2(\Omega)} \|v\|_{H^1(\Omega)}$ for $v \in V$.

Then the problems (3.3) and (3.4) respectively become

(5.4) to find $u \in V$, $J(u) \le J(v)$ for all $v \in V$ and

(5.5) to find $u \in V$, $((u, \varphi)) = (f, \varphi)_{L^2(\Omega)}$ for all $\varphi \in V$.

From what we have seen in Section 2 these two equivalent problems
have unique solutions.

The Problem (5.5) is the weak (or variational) formulation of the
Dirichlet problem (if $\Gamma_2 = \emptyset$), of the Neumann problem (if $\Gamma_1 = \emptyset$) and of the
mixed boundary value problem in the general case.

We now interpret the solutions of Problem (5.5), when they are
sufficiently regular, as solutions of the classical Dirichlet (resp. Neumann
or mixed) problems.

Suppose we assume $u \in C^2(\overline{\Omega}) \cap V$ and $v \in C^1(\overline{\Omega}) \cap V$. We can write,
using the Green's formula (4.10),

$a(u, v) = ((u, v)) = \int_\Omega (-\Delta u + u) v \, dx + \int_\Gamma \frac{\partial u}{\partial n} v \, d\sigma = \int_\Omega f v \, dx$

(5.6) i.e. $\int_\Omega (-\Delta u + u - f) v \, dx + \int_\Gamma \frac{\partial u}{\partial n} v \, d\sigma = 0$.

We note that this formula remains valid if $u \in H^2(\Omega) \cap V$ for any
$v \in V$.

First we choose $v \in \mathcal{D}(\Omega) \subset V$ (it is enough to take $v \in C^1_0(\Omega) \subset V$); then
the boundary integral vanishes so that we get

$\int_\Omega (-\Delta u + u - f) v \, dx = 0 \quad \forall v \in \mathcal{D}(\Omega)$.

Since $\mathcal{D}(\Omega)$ is dense in $L^2(\Omega)$ this implies that (if $u \in H^2(\Omega)$) $u$ is a
solution of the differential equation

(5.7) $-\Delta u + u - f = 0$ in $\Omega$ (in the sense of $L^2(\Omega)$).

More generally, without the strong regularity assumption as above,
$u$ is a solution of the differential equation

(5.8) $-\Delta u + u - f = 0$ in the sense of distributions in $\Omega$.

Next we choose $v \in V$ arbitrary. Since $u$ satisfies the equation (5.8) in
$\Omega$ we find from (5.6) that

(5.9) $\int_{\Gamma_2} \frac{\partial u}{\partial n} v \, d\sigma = 0 \quad \forall v \in V$,

which means that $\partial u/\partial n = 0$ on $\Gamma_2$ in some generalized sense. In fact, by the
trace theorem $\gamma v \in H^{1/2}(\Gamma)$ and hence $\partial u/\partial n = 0$ in $H^{-1/2}(\Gamma)$ (see Lions and
Magenes [32]). Thus, if the Problem (5.5) has a regular solution then
it is the solution of the classical problem

(5.10)
$-\Delta u + u = f$ in $\Omega$,
$u = 0$ on $\Gamma_1$,
$\partial u/\partial n = 0$ on $\Gamma_2$.

The Problem (5.10) is the classical Dirichlet (resp. Neumann, or
mixed) problem for the elliptic differential operator $-\Delta u + u$ if $\Gamma_2 = \emptyset$
(resp. $\Gamma_1 = \emptyset$ or general $\Gamma_1$, $\Gamma_2$).

Remark 5.1. The variational formulation (5.5) is very much used in the
finite element method.

Example 5.1 is a special case of the following more general problem.

Example 5.1'. Let $\Omega$, $\Gamma = \Gamma_1 \cup \Gamma_2$ and $V$ be as in Example 5.1. Suppose
given an integro-differential bilinear form:

(5.11) $a(u, v) = \int_\Omega \sum_{i,j} a_{ij}(x) (D_i u)(D_j v) \, dx + \int_\Omega a_0(x) u v \, dx$,

where the coefficients satisfy the following conditions:

(5.12)
$a_{ij} \in L^\infty(\Omega)$, $a_0 \in L^\infty(\Omega)$;
condition of ellipticity: there exists a constant $\alpha > 0$ such that
$\sum_{i,j} a_{ij}(x) \xi_i \xi_j \ge \alpha \sum_i \xi_i^2$ for $\xi = (\xi_1, \dots, \xi_n) \in \mathbb{R}^n$, a.e. in $\Omega$;
$a_0(x) \ge \alpha > 0$.

It follows by a simple application of the Cauchy-Schwarz inequality that
the bilinear form is well defined and bicontinuous on $V$: for all $u, v \in V$,

$|a(u, v)| \le \max(\|a_{ij}\|_{L^\infty(\Omega)}, \|a_0\|_{L^\infty(\Omega)}) \|u\|_V \|v\|_V$.

$a(\cdot, \cdot)$ is also coercive: by the ellipticity and the last condition on $a_0$,

$a(v, v) \ge \alpha \int_\Omega \left( \sum_i |D_i v|^2 + |v|^2 \right) dx = \alpha \|v\|_V^2$, $v \in V$.

Suppose given $f \in L^2(\Omega)$ and $g \in L^2(\Gamma)$. Then the linear functional

(5.13) $v \mapsto L(v) = \int_\Omega f v \, dx + \int_\Gamma g \, \gamma v \, d\sigma$

on $V$ is continuous and we have, again by the Cauchy-Schwarz inequality,

$|L(v)| \le \|f\|_{L^2(\Omega)} \|v\|_{L^2(\Omega)} + \|g\|_{L^2(\Gamma)} \|\gamma v\|_{L^2(\Gamma)} \le C (\|f\|_{L^2(\Omega)} + \|g\|_{L^2(\Gamma)}) \|v\|_V$

by the trace theorem, with a constant $C > 0$ depending only on the trace map.

We introduce the functional

$v \mapsto J(v) = \frac{1}{2} a(v, v) - L(v)$.

For the Problem (5.4) of minimising $J$ on $V$ we further assume

$a_{ij} = a_{ji}$, $1 \le i, j \le n$.

If $a_{ij}$ are smooth functions in $\Omega$ and $u$ is a smooth solution of the
Problem (5.5) we can interpret $u$ as a solution of a classical problem
using the Green's formula as we did in the earlier case. We shall indicate
only the essential facts. We introduce the formal differential operator

(5.14) $Au = -\sum_{i,j} D_j(a_{ij} D_i u) + a_0 u$.

If $a_{ij}$ are smooth (for instance, $a_{ij} \in C^1(\overline{\Omega})$) then $A$ is a differential
operator in the usual sense. By Green's formula we find that

(5.15) $a(u, v) = -\int_\Omega \sum_{i,j} D_j(a_{ij} D_i u) v \, dx + \int_\Gamma \sum_{i,j} a_{ij} (D_i u) n_j(x) v \, d\sigma + \int_\Omega a_0 u v \, dx$

where $(n_1(x), \dots, n_n(x))$ are the direction cosines of the exterior normal
to $\Gamma$ at $x$. The operator

(5.16) $\sum_{i,j} a_{ij} (D_i u) n_j(x) = \partial u / \partial n_A$

is called the co-normal derivative of $u$ with respect to the form $a(\cdot, \cdot)$. Thus
we can write (5.15) as

(5.15)' $a(u, v) = \int_\Omega (Au) v \, dx + \int_\Gamma \frac{\partial u}{\partial n_A} v \, d\sigma$

and hence the Problem (5.5) becomes

$\int_\Omega (Au - f) v \, dx + \int_\Gamma \left( \frac{\partial u}{\partial n_A} - g \right) v \, d\sigma = 0$.

Proceeding exactly as in the previous case we can conclude that the
Problem (5.5) is equivalent to the classical problem

$Au = f$ in $\Omega$,
$u = 0$ on $\Gamma_1$,
$\partial u / \partial n_A = g$ on $\Gamma_2$.

Example 5.2. Let $V = H^1_0(\Omega) = \{v \mid v \in H^1(\Omega),\ \gamma v = 0\}$, and $J$ be the
functional on $V$:

$v \mapsto J(v) = \frac{1}{2}\|v\|_V^2 - (f, v)_{L^2(\Omega)}$

where $f \in L^2(\Omega)$ is a given function. Suppose

(5.19) $K = \{v \mid v \in V,\ v(x) \ge 0 \text{ a.e. in } \Omega\}$.

It is clear that $K$ is convex and it is easily checked that $K$ is also
closed in $V$.

In fact, if $v_n \in K$ and $v_n \to v$ in $V$ then, for any $\varphi \in \mathcal{D}(\Omega)$ such that
$\varphi \ge 0$ in $\Omega$, we have

$\int_\Omega v \varphi \, dx = \lim_{n \to \infty} \int_\Omega v_n \varphi \, dx \ge 0$

(the first equality is an immediate consequence of the Cauchy-Schwarz
inequality since $v_n \to v$ in $L^2(\Omega)$ and $\varphi \in L^2(\Omega)$). This immediately implies
that $v \ge 0$ a.e. in $\Omega$ and hence $v \in K$.

We know from Section 3 that the minimising problem

(5.20) $u \in K$; $J(u) \le J(v)$, $\forall v \in K$

is equivalent to the variational inequality:

(5.21) $u \in K$; $a(u, v - u) \ge L(v - u) = (f, v - u)_{L^2(\Omega)}$, $\forall v \in K$

and both have unique solutions. In order to interpret this latter problem
we find, on applying the Green's formula,

(5.22) $\int_\Omega (-\Delta u + u - f)(v - u) \, dx + \int_\Gamma \frac{\partial u}{\partial n}(v - u) \, d\sigma \ge 0$, $\forall v \in K$.

Since $u, v \in K \subset V = H^1_0(\Omega)$ the boundary integral vanishes and so

(5.23) $\int_\Omega (-\Delta u + u - f)(v - u) \, dx \ge 0$, $\forall v \in K$.

If $\varphi \in K$, taking $v = u + \varphi \in K$ we get

$\int_\Omega (-\Delta u + u - f) \varphi \, dx \ge 0$, $\forall \varphi \in K$,

from which we conclude that $-\Delta u + u - f \ge 0$ a.e. in $\Omega$. For, if $\omega$ is an
open subset of $\Omega$ where $-\Delta u + u - f < 0$, we take a $\varphi \in \mathcal{D}(\Omega)$ with $\varphi \ge 0$
and supp $\varphi \subset \omega$. Such a $\varphi$ clearly belongs to $K$ and we would arrive at a
contradiction. In particular, this argument also shows that on the subset
of $\Omega$ where $u > 0$, $u$ satisfies the equation $-\Delta u + u = f$.

Next if we choose $v = 2u \in K$ in (5.23) we find

$\int_\Omega (-\Delta u + u - f) u \, dx \ge 0$

and if we choose $v = \frac{1}{2}u \in K$ we find

$\int_\Omega (-\Delta u + u - f) u \, dx \le 0$.

These two together imply that

(5.24) $\int_\Omega (-\Delta u + u - f) u \, dx = 0$,

and since the integrand is non-negative a.e. in $\Omega$ it follows that
$(-\Delta u + u - f) u = 0$ a.e. in $\Omega$.

Thus the solution of the variational inequality can be interpreted
(when it is sufficiently smooth) as the (unique) solution of the problem:

(5.25)
$(-\Delta u + u - f) u = 0$ in $\Omega$,
$-\Delta u + u - f \ge 0$ a.e. in $\Omega$,
$u \ge 0$ a.e. in $\Omega$,
$u = 0$ on $\Gamma$.

Remark 5.2. The equivalent minimisation problem can be solved nu- 
merically (for example, by Gauss-Seidel method). (See Chapter 0] § 

ED- 
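
To give an idea of how a Gauss-Seidel type method can handle the constraint $u \ge 0$, the following sketch treats a discrete analogue of problem (5.25). Everything specific in it is an assumption made for illustration and is not taken from the text: the domain is $\Omega = \,]0, 1[$, the operator $-u'' + u$ is replaced by the usual three-point finite difference matrix on a uniform grid, the data `f` is a hypothetical function changing sign, and each Gauss-Seidel update is followed by a truncation at zero (a projected relaxation sweep).

```python
import math

def projected_gauss_seidel(diag, off, f, sweeps=5000):
    """Minimise 0.5*(Au, u) - (f, u) over u >= 0, where A is tridiagonal
    with constant diagonal `diag` and constant off-diagonal `off`."""
    n = len(f)
    u = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 1 else 0.0
            # One-variable unconstrained minimiser, then projection onto [0, +inf).
            u[i] = max(0.0, (f[i] - off * (left + right)) / diag)
    return u

def residual(u, diag, off, f):
    """Discrete analogue of -u'' + u - f at the interior grid points."""
    n = len(u)
    r = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        r.append(diag * u[i] + off * (left + right) - f[i])
    return r

n = 19                       # interior grid points (assumed discretisation)
h = 1.0 / (n + 1)
diag = 2.0 / h**2 + 1.0      # finite difference form of -u'' + u
off = -1.0 / h**2
f = [50.0 * math.sin(2.0 * math.pi * (i + 1) * h) for i in range(n)]

u = projected_gauss_seidel(diag, off, f)
r = residual(u, diag, off, f)
```

At convergence the vectors `u` and `r` satisfy the discrete counterparts of the three relations in (5.25): `u[i] >= 0`, `r[i] >= 0` and `u[i] * r[i] = 0` (up to rounding) at every grid point.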

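Exercise 5.2. Let $\Omega$ be a bounded open set in $\mathbb{R}^n$ with smooth boundary
$\Gamma$. Let $V = H^1(\Omega)$ and $K$ be the subset

(5.26) $K = \{v \mid v \in H^1(\Omega);\ \gamma v \ge 0 \text{ a.e. on } \Gamma\}$.

Once again $K$ is a closed convex set. To see that it is closed, if $v_n \in K$
is a sequence such that $v_n \to v$ in $V$ then, since $\gamma : H^1(\Omega) = V \to L^2(\Gamma)$
is continuous linear, $\gamma v_n \to \gamma v$ in $L^2(\Gamma)$. Now, if $\varphi \in L^2(\Gamma)$ is such that
$\varphi \ge 0$ a.e. on $\Gamma$ then

$\int_\Gamma (\gamma v) \varphi \, d\sigma = \lim_{n \to \infty} \int_\Gamma (\gamma v_n) \varphi \, d\sigma \ge 0$,

from which we deduce as in the previous example that $\gamma v \ge 0$.

Let $f \in L^2(\Omega)$ be given. The problem of minimising the functional

$v \mapsto J(v) = \frac{1}{2}((v, v))_V - (f, v)_{L^2(\Omega)}$

on the closed convex set $K$ is equivalent to the variational inequality

(5.28) $u \in K$: $a(u, v - u) = ((u, v - u))_V \ge (f, v - u)_{L^2(\Omega)}$, $\forall v \in K$.

Assuming the solution $u$ (which exists and is unique by the results above)
is sufficiently regular we can interpret $u$ as follows. By Green's formula
we have

(5.29) $\int_\Omega (-\Delta u + u - f)(v - u) \, dx + \int_\Gamma \frac{\partial u}{\partial n}(v - u) \, d\sigma \ge 0$, $\forall v \in K$.

If $\varphi \in \mathcal{D}(\Omega)$ the boundary integral vanishes for $v = u \pm \varphi$, which
belongs to $K$, and

$\int_\Omega (-\Delta u + u - f) \varphi \, dx = 0$, $\forall \varphi \in \mathcal{D}(\Omega)$,

which implies that

$-\Delta u + u = f$ in $\Omega$.

Taking now $v = u + \varphi$ with $\varphi \in V$, $\gamma\varphi \ge 0$, and using this equation, (5.29)
gives $\int_\Gamma \frac{\partial u}{\partial n} \gamma\varphi \, d\sigma \ge 0$, so that $\partial u/\partial n \ge 0$ on $\Gamma$.
Next, since $v = 2u$ and $v = \frac{1}{2}u$ also belong to $K$, we find that

$\int_\Gamma \frac{\partial u}{\partial n} u \, d\sigma = 0$,

which implies that

$\frac{\partial u}{\partial n} u = 0$ a.e. on $\Gamma$.

Thus the variational inequality (5.28) is equivalent to the following
Problem:

(5.30)
$-\Delta u + u = f$ in $\Omega$,
$\frac{\partial u}{\partial n} u = 0$ on $\Gamma$,
$\frac{\partial u}{\partial n} \ge 0$ on $\Gamma$,
$u \ge 0$ on $\Gamma$.

One can also deduce from (5.30) that on the subset of $\Gamma$ where $u > 0$,
$u$ satisfies the homogeneous Neumann condition

$\frac{\partial u}{\partial n} = 0$.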

Example 5.3. Let $\Omega$ be a bounded open set in $\mathbb{R}^n$ with smooth boundary
$\Gamma$ and $1 \le p < +\infty$. We introduce the space

(5.31) $V = \{v \mid v \in L^2(\Omega);\ D_j v \in L^{2p}(\Omega),\ j = 1, \dots, n\}$

provided with its natural norm

(5.32) $v \mapsto \|v\|_V = \|v\|_{L^2(\Omega)} + \sum_{j=1}^{n} \|D_j v\|_{L^{2p}(\Omega)}$.

Then $V$ becomes a reflexive Banach space. Consider the functional
$J : V \to \mathbb{R}$:

(5.34) $v \mapsto J(v) = \frac{1}{2p} \sum_j \int_\Omega |D_j v|^{2p} \, dx + \frac{1}{2} \int_\Omega |v|^2 \, dx - \int_\Omega f v \, dx$

where $f \in L^2(\Omega)$ is given. If we set $g_j(t) = \frac{1}{2p}|t|^{2p}$ we get a $C^1$-function
$g_j : \mathbb{R} \to \mathbb{R}$ and we have $g_j'(t) = |t|^{2p-2} t$ for all $j = 1, \dots, n$. Then
from Exercise I.1.1, the functional

$u \mapsto \sum_j \int_\Omega g_j(u) \, dx$

is once G-differentiable in all directions and its G-derivative in any
direction $\varphi$ is given by

$\sum_j \int_\Omega g_j'(u) \varphi \, dx$, $\forall \varphi \in V$.

Hence we obtain, in our case,

(5.35) $J'(u, \varphi) = \sum_j \int_\Omega |D_j u|^{2p-2} (D_j u)(D_j \varphi) \, dx + \int_\Omega u \varphi \, dx - \int_\Omega f \varphi \, dx$.

Then the minimisation problem

(5.36) $u \in V$; $J(u) \le J(v)$, $\forall v \in V$,

is equivalent by Theorem 3.1 to the problem

(5.37) $u \in V$; $J'(u, \varphi) = 0$, $\forall \varphi \in V$.

We can verify that $J$ is strictly convex; for instance, we can compute
$J''(u; \varphi, \varphi)$ for any $\varphi \in V$ and find

(5.38) $J''(u; \varphi, \varphi) = \int_\Omega \left( (2p - 1) \sum_j |D_j u|^{2(p-1)} |D_j \varphi|^2 + |\varphi|^2 \right) dx > 0$

for any $\varphi \in V$ with $\varphi \ne 0$. Then Proposition 3.2 of Chapter 1 implies the strict
convexity of $J$.

We claim that

$J(v) \to +\infty$ as $\|v\|_V \to +\infty$.

In fact, first of all by the Cauchy-Schwarz inequality we have

$\left| \int_\Omega f v \, dx \right| \le \|f\|_{L^2(\Omega)} \|v\|_{L^2(\Omega)}$

and hence

$J(v) \ge \frac{1}{2p} \sum_j \|D_j v\|_{L^{2p}(\Omega)}^{2p} + \frac{1}{2} \|v\|_{L^2(\Omega)}^2 - \|f\|_{L^2(\Omega)} \|v\|_{L^2(\Omega)}$,

which tends to $+\infty$ as $\|v\|_V \to +\infty$.

Then by Theorem 1.2 the minimisation problem (5.36) has a unique
solution.

Finally, if we take $\varphi \in \mathcal{D}(\Omega) \subset V$ in the equation (5.37), with $J'$ given by (5.35), we get

$\int_\Omega \left( \sum_j |D_j u|^{2p-2} (D_j u)(D_j \varphi) + u \varphi - f \varphi \right) dx = 0$.

On integration by parts this becomes

$\int_\Omega \left( -\sum_j D_j(|D_j u|^{2p-2} D_j u) + u - f \right) \varphi \, dx = 0$.

Thus the solution of the minimising problem (5.36) for $J$ in $V$ can be
interpreted as the solution of the non-linear problem

(5.39) $u \in V$, $-\sum_j D_j(|D_j u|^{2p-2} D_j u) + u = f$ in $\Omega$.

We have used the fact that $\mathcal{D}(\Omega)$ is dense in $L^{p'}(\Omega)$ where $\frac{1}{p} + \frac{1}{p'} = 1$.

The problem (5.39) is a generalized Neumann problem for the non-linear
(Laplacian) operator

(5.40) $-\sum_j D_j(|D_j u|^{2p-2} D_j u) + u$.

Example 5.4. Let $\Omega$ and $\Gamma$ be as in the previous example and

(5.41) $V = H^1(\Omega) \cap L^4(\Omega)$.

We have seen in Section 4 that $V$ is a reflexive Banach space for its
natural norm

(5.42) $v \mapsto \|v\|_{H^1(\Omega)} + \|v\|_{L^4(\Omega)} = \|v\|_V$.

Consider the functional $J$ on $V$ given by

(5.43) $v \mapsto J(v) = \frac{1}{2}\|v\|_{H^1(\Omega)}^2 + \frac{1}{4}\|v\|_{L^4(\Omega)}^4 - (f, v)_{L^2(\Omega)}$,

where $f \in L^2(\Omega)$ is given. It is easily verified that $J$ is twice G-differentiable
and

$J'(u, \varphi) = ((u, \varphi)) + \int_\Omega u^3 \varphi \, dx - (f, \varphi)_{L^2(\Omega)}$

(hence $J$ has a gradient),

$J''(u; \varphi, \psi) = ((\varphi, \psi)) + 3 \int_\Omega u^2 \varphi \psi \, dx$.

Thus $J''(u; \varphi, \varphi) > 0$ for $u \in V$, $\varphi \in V$ with $\varphi \ne 0$, which implies that
$J$ is strictly convex by Proposition 3.2 of Chapter 1. As in the previous example
we can show, using the Cauchy-Schwarz inequality (for the term $(f, v)_{L^2(\Omega)}$),
that

$J(v) \to +\infty$ as $\|v\|_V \to +\infty$.

Then by Theorem 1.2 the minimisation problem for $J$ on $V$ has a
unique solution. An application of Green's formula shows that this
unique solution (when it is regular) is the solution of the non-linear
problem:

(5.44)
$-\Delta u + u + u^3 = f$ in $\Omega$,
$\partial u/\partial n = 0$ on $\Gamma$.

Remark 5.3. It is, in general, difficult to solve the non-linear problem
(5.44) numerically and it is easier to solve the equivalent minimisation
problem for $J$ given by (5.43).

Remark 5.4. All the functionals considered in the examples discussed
in this section are strictly convex and they give rise to strongly monotone
operators. We recall the following

Definition 5.1. An operator $A : U \subset V \to V'$ on a subset $U$ of a normed
vector space into its dual is called monotone if

$\langle Au - Av, u - v \rangle_{V' \times V} \ge 0$ for all $u, v \in U$.

$A$ is said to be strictly monotone if $\langle Au - Av, u - v \rangle_{V' \times V} > 0$ for any
pair of distinct elements $u, v \in U$ (i.e. if $u \ne v$). (See, for instance, [44]).



Chapter 3 



Minimisation Without 
Constraints - Algorithms 

We have considered in the previous chapter results of theoretical nature
on the existence and uniqueness of solutions to minimisation problems
and the solutions were characterized with the aid of the convexity and
differentiability properties of the given functional. Here we shall be
concerned with the constructive aspects of the minimisation problem,
namely the description of algorithms for the construction of sequences
approximating the solution. We give in this chapter some algorithms
for the minimisation problem in the absence of constraints and we shall
discuss the convergence of the sequences thus constructed.

The algorithms (i.e. the methods for constructing the minimizing
sequences) described below will make use of the differential calculus
of functionals on Banach spaces developed in Chapter 1. We shall be
mainly concerned with the following classes of algorithms:

(1) the method of descent and 

(2) generalized Newton's method. 

We shall mention the conjugate gradient method only briefly. The 
first class of methods mainly make use of the calculus of first order 
derivatives while the generalized Newton's method relies heavily on the 
calculus involving second order derivatives in Banach spaces. 



Suppose $V$ is a Banach space and $J : V \to \mathbb{R}$ is a functional on
it. The algorithms consist in giving an iterative procedure to solve the
minimisation problem:

to find $u \in V$, $J(u) = \inf_{v \in V} J(v)$.

Suppose $J$ has a unique global minimum $u$ in $V$. We are interested in
constructing a sequence $u_k$, starting from an arbitrary $u_0 \in V$, such that
under suitable hypotheses on the functional $J$, $u_k$ converges to $u$ in $V$.
First of all, since $u$ is the unique global minimum the sequence $J(u_k)$ is
bounded below by $J(u)$. It is therefore natural to construct $u_k$ such that

(i) $J(u_k)$ is monotone decreasing.

This will imply that $J(u_k)$ converges, its limit being at least $J(u)$.
Further, if $J$ admits a gradient $G$ then we necessarily have $G(u) = 0$,
so that the sequence $u_k$ constructed should also satisfy the natural
requirement that

(ii) $G(u_k) \to 0$ in $V$ as $k \to \infty$.

Our method can roughly be described as follows: if, for some $k$,
$u_k$ is already known then the next iterate $u_{k+1}$ is determined by
choosing suitably a parameter $\rho_k > 0$ and a direction $w_k$ ($w_k \in V$,
$w_k \ne 0$) and then taking

$u_{k+1} = u_k - \rho_k w_k$.

We shall describe, in the sequel, certain choices of $\rho_k$ and $w_k$ which
will imply (i) and (ii), which in turn lead to convergence of $u_k$ to $u$. We shall call
such choices of $\rho_k$, $w_k$ convergent choices.

To simplify our discussion we shall restrict ourselves to the case of a
Hilbert space $V$. However, all our considerations of this chapter remain
valid for any reflexive Banach space with very minor changes and we
shall not go into the details of this. As there will be no possibility of
confusion we shall write $(\cdot, \cdot)$ and $\|\cdot\|$ for the inner product $(\cdot, \cdot)_V$ and
$\|\cdot\|_V$ respectively.



1 Method of Descent

This method includes a class of algorithms for the construction of
minimising sequences $u_k$. We shall begin with the following generalities in
order to motivate and explain the principle involved in this method.
Let $J : V \to \mathbb{R}$ be a functional on a Hilbert space $V$.

1.1 Generalities 

Starting from an initial value $u_0 \in V$ we construct $u_k$ iteratively with the
properties described in the introduction. Suppose $u_k$ is constructed; then
to construct $u_{k+1}$ we make two choices:

(1) a direction $w_k$ in $V$ called the "direction of descent";

(2) a real parameter $\rho = \rho_k$, and set $u_{k+1} = u_k - \rho_k w_k$ so that the
sequence thus constructed has the required properties.

The main idea in the choices of $w_k$ and $\rho_k$ can be motivated as follows.

Choice of $w_k$. We find $w_k \in V$ with $\|w_k\| = 1$ such that the restriction
of $J$ to the line in $V$ passing through $u_k$ and parallel to the direction
$w_k$ is decreasing in a neighbourhood of $u_k$, i.e. the function
$\mathbb{R} \ni \rho \mapsto J(u_k + \rho w_k) \in \mathbb{R}$ is decreasing for $|\rho|$ sufficiently small.

If $J$ is G-differentiable then we have by Taylor's formula

$J(u_k + \rho w_k) = J(u_k) + J'(u_k, \rho w_k) + \dots$
$= J(u_k) + \rho J'(u_k, w_k) + \dots$

(by homogeneity of $\varphi \mapsto J'(u, \varphi)$). For $|\rho|$ small, since the dominant term
in this expansion is $\rho J'(u_k, w_k)$ and since we want $J(u_k + \rho w_k) < J(u_k)$,
the best choice of $w_k$ (at least locally) should be such that

$\rho J'(u_k, w_k) \le 0$ and is largest in magnitude.

If $J$ has a gradient $G$ then

$\rho J'(u_k, w_k) = \rho (G(u_k), w_k) \le 0$

and our requirement will be satisfied if $\rho w_k$ is chosen proportional to
$G(u_k)$ and opposite in direction. We note that this may not be the best
choice of $w_k$ from a global point of view. We shall therefore write

$J(u_k - \rho w_k)$ with $\rho > 0$

so that $J(u_k - \rho w_k)$ decreases as $k$ increases for $\rho > 0$ small enough.

Choice of $\rho$ ($= \rho_k$). Once the direction of descent $w_k$ is chosen then the
iterative procedure can be done with a constant $\rho > 0$. It is however
more suitable to do this with a variable $\rho$. We shall therefore choose
$\rho = \rho_k > 0$ in a small interval with the property $J(u_k - \rho_k w_k) < J(u_k)$
and set

$u_{k+1} = u_k - \rho_k w_k$.

We do this in several steps. Since

$j = \inf_{v \in V} J(v) \le J(u_{k+1}) \le J(u_k)$

we have

$J(u_k) - J(u_{k+1}) \ge 0$ and $\lim_{k \to +\infty} (J(u_k) - J(u_{k+1})) = 0$

because $J(u_k)$ is decreasing and bounded below. If $J$ is differentiable
then Taylor's formula implies that

$J(u_k) - J(u_{k+1})$ behaves like $-J'(u_k, u_{k+1} - u_k) = \rho_k J'(u_k, w_k)$

so that it is natural to require that

$\rho_k > 0$, $\rho_k J'(u_k, w_k) \to 0$ as $k \to +\infty$.

Roughly speaking, we shall say that the choice of $\rho_k$ is a "convergent
choice" if this condition implies $J'(u_k, w_k) \to 0$ as $k \to +\infty$. If,
moreover, $J$ has a gradient $G$ then the choice of the direction of descent $w_k$
is a "convergent choice" if $J'(u_k, w_k) = (G(u_k), w_k) \to 0$ implies that
$\|G(u_k)\| \to 0$ as $k \to +\infty$.
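
The iteration $u_{k+1} = u_k - \rho_k w_k$ can be sketched as follows. This is only an illustration under assumptions that are not in the text: $V = \mathbb{R}^2$, $J(v) = \frac{1}{2}(Av, v) - (f, v)$ for a hypothetical symmetric positive definite matrix `A`, the direction $w_k$ is the normalised gradient, and $\rho_k$ is found by a simple halving rule that insists on a decrease of at least $\frac{1}{2}\rho_k J'(u_k, w_k)$. This halving rule is one possible way of choosing $\rho_k$, used here only for concreteness.

```python
import math

A = [[3.0, 1.0], [1.0, 2.0]]   # hypothetical symmetric positive definite matrix
f = [1.0, 2.0]                 # hypothetical data; the minimum of J is (0, 1)

def G(v):
    """Gradient G(v) = A v - f of J."""
    return [sum(a * vj for a, vj in zip(row, v)) - fi for row, fi in zip(A, f)]

def J(v):
    """J(v) = 0.5 * (A v, v) - (f, v)."""
    Av = [sum(a * vj for a, vj in zip(row, v)) for row in A]
    return 0.5 * sum(x * y for x, y in zip(Av, v)) - sum(x * y for x, y in zip(f, v))

def descent(u, steps=500, rho0=1.0, tol=1e-6):
    """Descent iteration u_{k+1} = u_k - rho_k * w_k with w_k = G(u_k)/||G(u_k)||."""
    history = [J(u)]
    for _ in range(steps):
        g = G(u)
        norm_g = math.hypot(g[0], g[1])
        if norm_g < tol:
            break
        w = [gi / norm_g for gi in g]     # unit direction of descent
        slope = norm_g                    # J'(u_k, w_k) = (G(u_k), w_k)
        rho = rho0
        # Halve rho until J decreases by at least 0.5 * rho * J'(u_k, w_k).
        while J([ui - rho * wi for ui, wi in zip(u, w)]) > history[-1] - 0.5 * rho * slope:
            rho *= 0.5
        u = [ui - rho * wi for ui, wi in zip(u, w)]
        history.append(J(u))
    return u, history

u_final, history = descent([5.0, -5.0])
g_final = G(u_final)
```

Here `history` records the values $J(u_k)$, which are non-increasing by construction (property (i)), and the final gradient `g_final` is small (property (ii)).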

The above considerations lead us to the following definitions which
we shall use in all our algorithms and all our proofs of convergence.

Definition 1.1. The choice of $\rho_k$ is said to be convergent if the conditions

$\rho_k > 0$, $u_{k+1} = u_k - \rho_k w_k$,

$J(u_k) - J(u_{k+1}) > 0$, $\lim_{k \to +\infty} (J(u_k) - J(u_{k+1})) = 0$

imply that

$\lim_{k \to +\infty} J'(u_k, w_k) = 0.$

Suppose $J$ has a gradient $G$ in $V$.

Definition 1.2. The choice of the direction $w_k$ is said to be convergent if
the conditions

$w_k \in V$, $J'(u_k, w_k) > 0$, $\lim_{k \to +\infty} J'(u_k, w_k) = 0$

imply that

$\lim_{k \to +\infty} \|G(u_k)\| = 0.$
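The iteration $u_{k+1} = u_k - \rho_k w_k$ behind these two definitions can be sketched numerically. The block below is only an illustration under stated assumptions, not part of the notes: the quadratic functional on $\mathbb{R}^2$ (matrix `A`, vector `f`), the starting step `rho`, the halving rule and all function names are hypothetical choices made for the example.

```python
import math

# Assumed test functional: J(v) = 1/2 (Av, v) - (f, v) on R^2,
# with A symmetric positive definite (illustrative data).
A = [[3.0, 1.0], [1.0, 2.0]]
f = [1.0, 1.0]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(M, x):
    return [dot(row, x) for row in M]

def J(v):
    return 0.5 * dot(matvec(A, v), v) - dot(f, v)

def G(v):
    # Gradient of the quadratic functional: G(v) = Av - f.
    return [a - b for a, b in zip(matvec(A, v), f)]

def descent(u0, rho=0.1, steps=200):
    """Iterate u_{k+1} = u_k - rho_k w_k with w_k = G(u_k)/||G(u_k)||.

    A trial step is accepted only if it decreases J; otherwise rho is halved.
    Returns the last iterate and the list of accepted values J(u_k).
    """
    u, values = list(u0), [J(u0)]
    for _ in range(steps):
        g = G(u)
        norm_g = math.sqrt(dot(g, g))
        if norm_g < 1e-12:
            break
        w = [gi / norm_g for gi in g]
        candidate = [ui - rho * wi for ui, wi in zip(u, w)]
        if J(candidate) >= J(u):
            rho *= 0.5  # step too long: shrink it and try again
            continue
        u = candidate
        values.append(J(u))
    return u, values
```

Calling `descent([2.0, -1.0])` returns a strictly decreasing list of values of $J$, which is exactly the requirement $J(u_k) - J(u_{k+1}) > 0$ appearing in Definition 1.1.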

1.2 Convergent choice of the direction of descent $w_k$

This section is devoted to some algorithms for convergent choices of $w_k$.
In each case we show that the choice of $w_k$ described is convergent in
the sense of Definition 1.2.

w-Algorithm 1. We assume that $J$ has a gradient $G$ in $V$. Let a real
number $\alpha$ be given with $0 < \alpha \le 1$. We choose $w_k \in V$ such that

(1.1) $(G(u_k)/\|G(u_k)\|, w_k) \ge \alpha > 0$, $\|w_k\| = 1$.

Proposition 1.1. w-Algorithm 1 gives a convergent choice of $w_k$.

Proof. We can write

$J'(u_k, w_k) = (G(u_k), w_k)$

so that by (1.1)

$J'(u_k, w_k) \ge \alpha \|G(u_k)\| > 0$

and hence

$J'(u_k, w_k) \to 0$ implies that $\|G(u_k)\| \to 0$ as $k \to +\infty$.

□

We note that (1.1) means that the angle between $w_k$ and $G(u_k)$ lies
in $]-\pi/2, \pi/2[$ and the cosine of this angle is bounded away from 0 by
$\alpha$.

w-Algorithm 2 - Auxiliary operator method.

This algorithm is a particular case of w-Algorithm 1 but is much
more used in practice.

Assume that $J$ has a gradient $G$ in $V$.

Let, for each $k$, $B_k \in \mathscr{L}(V, V)$ be an operator such that

(1.2) the $B_k$ are uniformly bounded: there exists a constant $\gamma > 0$
such that $\|B_k \psi\| \le \gamma \|\psi\|$, $\psi \in V$;
the $B_k$ are uniformly $V$-coercive: there exists a constant $\alpha > 0$
such that $(B_k \psi, \psi) \ge \alpha \|\psi\|^2$, $\psi \in V$.

Let us choose

(1.3) $w_k = B_k G(u_k)/\|B_k G(u_k)\|.$

Proposition 1.2. The choice (1.3) of $w_k$ is convergent.

Proof. As before we calculate

$J'(u_k, w_k) = (G(u_k), w_k) = (G(u_k), B_k G(u_k)/\|B_k G(u_k)\|)$

which, by uniform coercivity of $B_k$, is

$\ge \alpha \|G(u_k)\|^2/\|B_k G(u_k)\|$

$\ge \alpha \gamma^{-1} \|G(u_k)\|$ by uniform boundedness of $B_k$.

This immediately implies that

$J'(u_k, w_k) > 0$ and if $J'(u_k, w_k) \to 0$ then $\|G(u_k)\| \to 0$

and hence the choice of $w_k$ is convergent.
Moreover, again by (1.3) and the estimates above, we get

$(G(u_k)/\|G(u_k)\|, w_k) = (G(u_k)/\|G(u_k)\|, B_k G(u_k)/\|B_k G(u_k)\|) \ge \alpha \gamma^{-1} > 0,$

which means that this algorithm is a particular case of w-Algorithm 1.
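The bound $(G(u_k)/\|G(u_k)\|, w_k) \ge \alpha\gamma^{-1}$ can be checked on a small example. This is a sketch under assumptions: the matrix `B` below (symmetric, eigenvalues 1 and 3, so that one may take $\alpha = 1$ and $\gamma = 3$) and the helper names are illustrative and do not come from the notes.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(M, x):
    return [dot(row, x) for row in M]

def norm(x):
    return math.sqrt(dot(x, x))

def direction(B, g):
    """Direction w = B g / ||B g||, i.e. the choice (1.3) for a gradient g."""
    Bg = matvec(B, g)
    n = norm(Bg)
    return [c / n for c in Bg]

def cosine(g, w):
    """Value of (g/||g||, w) for a unit vector w."""
    return dot(g, w) / norm(g)

# Assumed auxiliary operator: symmetric, eigenvalues 1 and 3.
B = [[2.0, 1.0], [1.0, 2.0]]
ALPHA, GAMMA = 1.0, 3.0  # coercivity and boundedness constants of B
```

For any non-zero gradient `g`, `cosine(g, direction(B, g))` should then be at least `ALPHA / GAMMA`, so the direction satisfies condition (1.1) with the constant $\alpha\gamma^{-1}$.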

Remark 1.1. In certain cases (for example, when the $B_k$ are symmetric operators
satisfying (1.2)) this method is equivalent to making a change of variables
and taking as the direction of descent the direction of the gradient
of $J$ in the new variables and then choosing $w_k$ as the inverse image of
this direction in the original coordinates.

Consider the functional $J : V = \mathbb{R}^2 \to \mathbb{R}$ of our model problem of
Chapter 1, § 7:

$\mathbb{R}^2 \ni v \mapsto J(v) = \frac{1}{2} a(v, v) - L(v) = \frac{1}{2} (Av, v)_{\mathbb{R}^2} - (f, v)_{\mathbb{R}^2} \in \mathbb{R}.$

Since $a(\cdot, \cdot)$ is a positive definite quadratic form, $\{v \in \mathbb{R}^2, J(v) =$
constant$\}$ represents an ellipse. $B_k$ can be chosen such that the change
of variable effected by $B_k$ transforms such an ellipse into a circle where
the gradient direction is well-known, i.e. the direction of the radial vector
through $u_k$ (in the new coordinates).

w-Algorithm 3 - Conjugate gradient method

There are several algorithms known in the literature under the name
of conjugate gradient method. We shall, however, describe only one of
the algorithms which generalizes the conjugate gradient method in
finite dimensional spaces. (See [20], [22] and [24].)

Suppose the functional $J$ admits a gradient $G(u)$ and a Hessian $H(u)$
everywhere in $V$. Let $u_0 \in V$ be arbitrary. We choose $w_0 = G(u_0)/\|G(u_0)\|$.
(We observe that we may assume $G(u_0) \ne 0$ unless $u_0$ itself happens to
be the required minimum.) If $u_{k-1}, w_{k-1}$ are already known then we
choose $\rho_{k-1} > 0$ to be a point of minimum of the real valued function
$\rho \mapsto J(u_{k-1} - \rho w_{k-1})$ on $\mathbb{R}_+$,

i.e. $\rho_{k-1} > 0$ and $J(u_{k-1} - \rho_{k-1} w_{k-1}) = \inf_{\rho > 0} J(u_{k-1} - \rho w_{k-1})$.

Since $J$ is G-differentiable this real valued function of $\rho$ is
differentiable everywhere in $\mathbb{R}_+$ and

$\frac{d}{d\rho} J(u_{k-1} - \rho w_{k-1}) \big|_{\rho = \rho_{k-1}} = 0,$

which means that, if we set

$(1.4)_1$ $u_k = u_{k-1} - \rho_{k-1} w_{k-1}$

then we have

(1.5) $(G(u_k), w_{k-1}) = 0.$

Now we define a vector $\tilde{w}_k \in V$ by

$\tilde{w}_k = G(u_k) + \lambda_k w_{k-1}$

where $\lambda_k \in \mathbb{R}$ is chosen such that

$(H(u_k) \tilde{w}_k, w_{k-1}) = 0.$

Hence $\lambda_k$ is given by

$(1.4)_2$ $\lambda_k = - \dfrac{(H(u_k) G(u_k), w_{k-1})}{(H(u_k) w_{k-1}, w_{k-1})}.$

We remark that in applications we usually assume that $H(u)$ (for any
$u \in V$) defines a positive operator and hence the denominator in $(1.4)_2$
above is non-zero (see Remark 1.2 below). Then the vector

$(1.4)_3$ $w_k = \tilde{w}_k/\|\tilde{w}_k\|$

defines the direction of descent at the $k$-th stage of the algorithm.

This algorithm is called conjugate gradient method because of the
following remark.

Remark 1.2. Two directions $\varphi$ and $\psi$ are said to be conjugate with
respect to a positive definite quadratic form $a(\cdot, \cdot)$ on $V$ if $a(\varphi, \psi) = 0$. In
this sense, if $H(u_k)$ defines a positive definite quadratic form (i.e. $H(u_k)$
is a symmetric positive operator on $V$), two consecutive choices of
directions of descent $w_{k-1}, w_k$ are conjugate with respect to the quadric
$(H(u_k) w, w) = 1$. We recall that in the plane $\mathbb{R}^2$ such a quadric
represents an ellipse and two directions $\varphi, \psi$ in the plane are said to be
conjugate with respect to such an ellipse if $(H(u_k) \varphi, \psi) = 0$.
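One step of this construction can be traced on a quadratic functional, for which the Hessian is constant and the one-dimensional minimisation has a closed form. The block is a sketch under assumptions: the data `A`, `f`, `u0` and the function names are illustrative, and the formula in `exact_step` is valid only for a quadratic $J$.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(M, x):
    return [dot(row, x) for row in M]

def norm(x):
    return math.sqrt(dot(x, x))

def exact_step(H, g, w):
    """Minimiser of rho -> J(u - rho*w) when J is quadratic with Hessian H."""
    return dot(g, w) / dot(matvec(H, w), w)

def next_direction(H, g, w_prev):
    """Return lambda_k of (1.4)_2 and the normalised direction of (1.4)_3."""
    lam = -dot(matvec(H, g), w_prev) / dot(matvec(H, w_prev), w_prev)
    w_tilde = [gi + lam * wi for gi, wi in zip(g, w_prev)]
    n = norm(w_tilde)
    return lam, [c / n for c in w_tilde]

# Assumed model data: J(v) = 1/2 (Av, v) - (f, v), so G(v) = Av - f and H = A.
A = [[3.0, 1.0], [1.0, 2.0]]
f = [1.0, 1.0]

def G(v):
    return [a - b for a, b in zip(matvec(A, v), f)]

u0 = [2.0, -1.0]
g0 = G(u0)
w0 = [c / norm(g0) for c in g0]             # w_0 = G(u_0)/||G(u_0)||
rho0 = exact_step(A, g0, w0)                # rho_0 minimises J(u_0 - rho w_0)
u1 = [a - rho0 * b for a, b in zip(u0, w0)]  # (1.4)_1
lam1, w1 = next_direction(A, G(u1), w0)      # (1.4)_2 and (1.4)_3
```

With these values, `dot(G(u1), w0)` vanishes up to rounding, which is (1.5), and `dot(matvec(A, w1), w0)` vanishes as well, which is the conjugacy of $w_0$ and $w_1$ described in Remark 1.2.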

Now we have the following

Proposition 1.3. Suppose that the functional $J$ admits a gradient $G(u)$
and a Hessian $H(u)$ everywhere in $V$ and suppose further that there exist
two constants $C_0 > 0$, $C_1 > 0$ such that

(i) $(H(u)\varphi, \varphi) \ge C_0 \|\varphi\|^2$ for all $u, \varphi \in V$ and

(ii) $|(H(u)\varphi, \psi)| \le C_1 \|\varphi\| \|\psi\|$ for all $u, \varphi, \psi \in V$.

Then the w-Algorithm 3 defines a convergent choice of the $w_k$.

Proof. It is enough to verify that $w_k$ satisfies the condition (1.1). First
of all, in view of the definition of $\tilde{w}_k$ and (1.5) we have

$(G(u_k), \tilde{w}_k) = \|G(u_k)\|^2$

so that

$(G(u_k)/\|G(u_k)\|, w_k) = \|G(u_k)\| \|\tilde{w}_k\|^{-1}.$

We shall show that this is bounded below by a constant $\alpha > 0$
(independent of $k$). By (1.5) the vectors $G(u_k)$ and $w_{k-1}$ are orthogonal and hence

$\|\tilde{w}_k\|^2 = \|G(u_k)\|^2 + \lambda_k^2 \|w_{k-1}\|^2.$

Here, in view of the assumptions (i) and (ii) we find that

$\lambda_k^2 \|w_{k-1}\|^2 = \dfrac{(H(u_k) G(u_k), w_{k-1})^2}{(H(u_k) w_{k-1}, w_{k-1})^2} \|w_{k-1}\|^2 \le (C_0^{-1} C_1 \|G(u_k)\|)^2$

so that

$\|\tilde{w}_k\|^2 \le \|G(u_k)\|^2 (1 + C_0^{-2} C_1^2).$

Hence, taking the constant $\alpha > 0$ to be $(1 + C_0^{-2} C_1^2)^{-1/2}$ we get

$\|G(u_k)\| \|\tilde{w}_k\|^{-1} \ge \alpha > 0$

which proves the assertion.

□

1.3 Convergent Choices of $\rho_k$

We shall describe in this section some algorithms for the choice of the
parameter $\rho_k$ and we shall prove that these choices are convergent in the
sense of our Definition 1.1.

Given the direction $w_k$ of descent at the $k$-th stage we are interested
in points of the type

$u_k - \rho w_k$, $\rho > 0$,

and therefore all our discussions of this section are as if we have
functions of a single real variable $\rho$ defined in $\mathbb{R}_+$.

We shall use the following notation throughout this and the next
sections in order to simplify our writing:

Notation

$J(u_k - \rho w_k) = J^k_\rho$ for $\rho > 0$, $J(u_k) = J^k_0$,
$J'(u_k - \rho w_k, w_k) = J'^k_\rho$ for $\rho > 0$, $J'(u_k, w_k) = J'^k_0$,
$\Delta J^k_\rho = J^k_0 - J^k_\rho$.

Similarly, when $J$ has a gradient $G(u)$ and a Hessian $H(u)$ at every
point $u$ in $V$, we write

$G(u_k - \rho w_k) = G^k_\rho$ for $\rho > 0$,
$G(u_k) = G^k_0$

and

$H(u_k - \rho w_k) = H^k_\rho$ for $\rho > 0$,
$H(u_k) = H^k_0$.
We shall make the following two hypotheses throughout this section.

Hypothesis (H1): $\lim_{\|v\| \to +\infty} J(v) = +\infty$.

Hypothesis (H2): $J$ has a gradient $G(u)$ everywhere in $V$ and $G$ satisfies
a (uniform) Lipschitz condition on every bounded subset of $V$: for
every bounded set $K$ of $V$ there exists a constant $M_K > 0$ such that

$\|G(u) - G(v)\| \le M_K \|u - v\|$ for all $u, v \in K$.

In particular, if $J$ has a Hessian $H(u)$ everywhere in $V$ and if $H(u)$ is
bounded on bounded sets of $V$ then an application of Taylor's formula to
the mapping $V \ni u \mapsto G(u) \in V' = V$ shows that $J$ satisfies the hypothesis
(H2). In fact, if $u, v \in K$ then

$\|G(u) - G(v)\| = \sup_{\varphi \in V} |(G(u) - G(v), \varphi)|/\|\varphi\|$

$= \sup_{\varphi \in V} |(H(v + \theta(u - v))(u - v), \varphi)|/\|\varphi\| \le \text{const.} \|u - v\|,$

since $u, v \in K$ and $\theta \in ]0, 1[$ imply that $v + \theta(u - v)$ is also bounded and
hence $H(v + \theta(u - v))$ is bounded uniformly for all $\theta \in ]0, 1[$.

Now suppose given a $u_0 \in V$ at the beginning of the algorithm. Starting
from $u_0$ we shall construct a sequence $u_k$ such that $J(u_k)$ is decreasing
and so we have $J(u_k) \le J(u_0)$. We are interested in points of the type
$u_k - \rho w_k$ such that $J(u_k - \rho w_k) \le J(u_k)$.

We shall now deduce some immediate consequences of the hypotheses
H1 and H2, which will be constantly used to prove the convergence
of the choice of $\rho_k$ given by the algorithms of this section.
Let us denote by $U$ the subset of $V$:

$U = \{v \mid v \in V; J(v) \le J(u_0)\}.$

The set $U$ is bounded in $V$. In fact, if $U$ is not bounded then we
can find a sequence $v_j \in U$ such that $\|v_j\| \to +\infty$. Then $J(v_j) \to +\infty$ by
Hypothesis H1 and this is impossible since $v_j \in U$.

We are thus interested in constructing a sequence $u_k$ such that

$u_k \in U$ and $J(u_k)$ is decreasing.

Also since by requirement $J(u_k - \rho w_k) \le J(u_k)$ it follows that
$u_k - \rho w_k \in U$ and then $\rho$ will be bounded by diam $U$; for, we find,
since $\|w_k\| = 1$:

$0 \le \rho = \|\rho w_k\| = \|u_k - (u_k - \rho w_k)\| \le \text{diam } U.$

Let us denote the constant $M_U > 0$ given by Hypothesis H2 for the
bounded set $U$ by $M$.

Now the points $u_k - \rho w_k$, $u_k - \mu w_k$ belong to $U$ if $\rho, \mu > 0$ are chosen
sufficiently small. Then

$\|G^k_\rho - G^k_\mu\| = \|G(u_k - \rho w_k) - G(u_k - \mu w_k)\| \le M |\rho - \mu| \|w_k\| = M |\rho - \mu|;$

i.e. we have,

(1.6) $\|G^k_\rho - G^k_\mu\| \le M |\rho - \mu|$, $\|G^k_\rho - G^k_0\| \le M \rho$.

Since $J'^k_\rho = J'(u_k - \rho w_k, w_k) = (G(u_k - \rho w_k), w_k) = (G^k_\rho, w_k)$ we also
find from (1.6) that

(1.7) $|J'^k_\rho - J'^k_\mu| \le M |\rho - \mu|$, $|J'^k_\rho - J'^k_0| \le M \rho$.

We shall suppress the index $k$ when there is no possibility of confusion
and simply write $G_\rho, J_\rho, J'_\rho$ etc. respectively for $G^k_\rho, J^k_\rho, J'^k_\rho$ etc.
By Taylor's expansion we can write

$J_\rho = J(u - \rho w) = J(u) - \rho J'(u - \bar\rho w, w)$

for some $\bar\rho$ such that $0 < \bar\rho < \rho$, i.e. we can write

(1.8) $J_\rho = J_0 - \rho J'_{\bar\rho}.$

We can rewrite (1.8) also as

$J_\rho = J_0 - \rho J'_0 + \rho (J'_0 - J'_{\bar\rho}),$

which together with (1.7) gives

$J_\rho \le J_0 - \rho J'_0 + M \rho \bar\rho,$

that is, since $0 < \bar\rho < \rho$,

(1.9) $J_\rho \le J_0 - \rho J'_0 + M \rho^2.$

We shall use (1.8) and (1.9) in the following form

(1.8)' $\Delta J_\rho = \rho J'_{\bar\rho},$

(1.9)' $\rho J'_0 - M \rho^2 \le \Delta J_\rho.$

We are now in a position to describe the algorithms for convergent
choices of the parameter $\rho_k$.

$\rho$-Algorithm 1. Consider the two functions of $\rho > 0$ given by

$J_\rho = J(u_k - \rho w_k)$ and $T(\rho) = J_0 - \rho J'_0 + M \rho^2$.

Then $J_0 = T(0)$ and (1.9) says that $J_\rho \le T(\rho)$ for all $\rho > 0$.
Geometrically the curve $y = J_\rho$ lies below the parabola $y = T(\rho)$ for $\rho > 0$ in
the $(\rho, y)$-plane. Let $\hat\rho > 0$ be the point at which the function $T(\rho)$ has
a minimum. Then $\frac{dT}{d\rho} \big|_{\rho = \hat\rho} = 0$ implies $-J'_0 + 2M\hat\rho = 0$ so that we have

(1.10) $\hat\rho = J'_0/2M$, $T(\hat\rho) = \inf_{\rho > 0} T(\rho)$.

Let $C$ be a real number such that

(1.11) $0 < C \le 1$.

We choose $\rho$ $(= \rho_k)$ in the interval $[C\hat\rho, (2 - C)\hat\rho]$, i.e.

(1.12) $C \le \rho/\hat\rho \le (2 - C)$.

Then we have the

Proposition 1.4. Under the hypotheses (H1), (H2) the choice (1.12) of
$\rho = \rho_k$ is a convergent choice.

Proof. Since $T$ has its minimum at the point $\rho = \hat\rho$ we have by (1.11)
$C\hat\rho \le \hat\rho \le (2 - C)\hat\rho$. Moreover $T(\rho)$ decreases in the interval $[0, \hat\rho]$ while
it increases in the interval $[\hat\rho, (2 - C)\hat\rho]$ as can easily be checked. Hence,
if $\rho$ satisfies (1.12) then we have two cases:

$T(\rho) \le T(C\hat\rho)$ if $C\hat\rho \le \rho \le \hat\rho$ and
$T(\rho) \le T((2 - C)\hat\rho)$ if $\hat\rho \le \rho \le (2 - C)\hat\rho$.

Since

$T(C\hat\rho) = J_0 - (C J'_0/2M) J'_0 + M (C J'_0/2M)^2 = J_0 - (2 - C) C (J'_0)^2/4M,$

$T((2 - C)\hat\rho) = J_0 - ((2 - C) J'_0/2M) J'_0 + M ((2 - C) J'_0/2M)^2 = J_0 - (2 - C) C (J'_0)^2/4M$

using the value of $\hat\rho$ given by (1.10) and since $J_\rho \le T(\rho)$ for all $\rho > 0$ we
find that (in either of the above cases)

$J_\rho \le T(\rho) \le J_0 - (2 - C) C (J'_0)^2/4M.$

This immediately implies that

(1.13) $C (2 - C) (J'_0)^2/4M \le \Delta J_\rho.$

In order to show that the choice (1.12) is convergent we see that
(1.13), written for $\rho = \rho_k$, is nothing but

$\dfrac{C(2 - C)}{4M} (J'(u_k, w_k))^2 \le J(u_k) - J(u_k - \rho_k w_k) = J(u_k) - J(u_{k+1})$

since $u_{k+1} = u_k - \rho_k w_k$.
Hence if $J(u_k) - J(u_{k+1}) \to 0$ then $J'(u_k, w_k) \to 0$ as $k \to +\infty$, which
proves that the choice of $\rho_k$ such that

$C \le \rho_k/\hat\rho_k \le 2 - C$ where $\hat\rho_k = J'(u_k, w_k)/2M$

is a convergent choice.
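The quantities of $\rho$-Algorithm 1 are simple enough to compute directly. The following sketch assumes that the slope $J'_0$ (`dJ0`) and the Lipschitz constant $M$ are available; the function names are illustrative and not taken from the notes.

```python
def rho_hat(dJ0, M):
    """Minimiser of T(rho) = J0 - rho*dJ0 + M*rho**2, see (1.10)."""
    return dJ0 / (2.0 * M)

def admissible_interval(dJ0, M, C):
    """Interval [C*rho_hat, (2 - C)*rho_hat] of (1.12), for 0 < C <= 1."""
    if not 0.0 < C <= 1.0:
        raise ValueError("C must satisfy 0 < C <= 1")
    r = rho_hat(dJ0, M)
    return C * r, (2.0 - C) * r

def guaranteed_decrease(dJ0, M, C):
    """Lower bound C(2 - C) dJ0**2 / (4M) for J_0 - J_rho, see (1.13)."""
    return C * (2.0 - C) * dJ0 ** 2 / (4.0 * M)
```

As a usage example, take $J(x) = x^2$ on the real line, $u_k = 1$ and $w_k = 1$, so that $J'_0 = 2$ and one may take $M = 2$. With $C = 1/2$ the admissible interval is $[0.25, 0.75]$ and every $\rho$ in it gives a decrease $J_0 - J_\rho = 2\rho - \rho^2 \ge 0.4375$, which is indeed above the guaranteed value $0.375$.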
$\rho$-Algorithm 2. The constant $M$ in the $\rho$-Algorithm 1 is not in general
known a priori. This fact may cause difficulties in the sense that if
we start with an arbitrarily large $M > 0$ then by (1.12) $\rho_k$ will be very
small and so the scheme may not converge sufficiently fast. We can
get over this difficulty as described in the following algorithm, which
does not directly involve the constant $M$ and which can be considered
as a special case of $\rho$-Algorithm 1. But for this algorithm we need the
additional assumption that $J$ is convex.

Hypothesis H3. The functional $J$ is convex.

We suppose that, for some fixed $h > 0$, we have

(1.14) $J_0 > J_h > J_{2h} > \cdots > J_{mh}$, $J_{mh} \le J_{(m+1)h}$ for some integer $m \ge 2$.

Since $J$ is convex and has its minimum in $\rho > 0$ such an $m \ge 2$
always exists.

Proposition 1.5. If $J$ satisfies the hypotheses H1, H2, H3 then any choice
of $\rho$ $(= \rho_k)$ such that

(1.15) $(m - 1)h \le \rho \le mh$

is a convergent choice.

Proof. Let $\tilde\rho > 0$ be a point where $J_\rho$ attains its minimum. Then
$J'_{\tilde\rho} = 0$, $J_{\tilde\rho} \le J_\rho$ for all $\rho > 0$ and by (1.14) we should have

(1.16) $(m - 1)h \le \tilde\rho \le (m + 1)h.$

Then (1.7) will imply

$0 < J'_0 = |J'_{\tilde\rho} - J'_0| \le M \tilde\rho$

and thus we find

(1.17) $2\hat\rho = J'_0/M \le \tilde\rho$

and

(1.18) $2\hat\rho/(m + 1) \le h.$

This, together with the fact that $m \ge 2$, will in turn imply

$2\hat\rho/3 \le (m - 1)h.$

As $J_\rho$ decreases in $0 \le \rho \le mh$ we get

$\Delta J_{(m-1)h} = J_0 - J_{(m-1)h} \ge J_0 - J_{2\hat\rho/3} = \Delta J_{2\hat\rho/3}.$

If we now apply the $\rho$-Algorithm 1 with $C = 2/3$ in (1.12) and in
(1.13), then we obtain, from the above inequality,

(1.19) $\Delta J_{(m-1)h} \ge \dfrac{2}{9M} (J'_0)^2,$

which proves that $\rho = (m - 1)h$ is a convergent choice. Similarly, if
$\rho \in [(m - 1)h, mh]$ (i.e. (1.15)) then the same argument shows that

(1.20) $\Delta J_\rho \ge \Delta J_{(m-1)h} \ge \dfrac{2}{9M} (J'_0)^2,$

and hence any $\rho_k = \rho$ satisfying (1.15) is again a convergent choice.

Some Generalizations of $\rho$-Algorithm 2.

In the above algorithm a suitable initial choice of $h > 0$ has to be
made. But such an $h$ can be either too large or too small and if, for
example, $h$ is too small then the procedure may become very long to use
numerically. In order to overcome such difficulties we can generalize
$\rho$-Algorithm 2 as follows.

If the initial value of $h > 0$ is too small we can repeat our arguments
above with (1.14) replaced by

(1.14)'

and if the initial value of $h$ is too large we can compute $J$ at the points
$h/p, h/p^2, h/p^3, \dots$ where $p$ is an integer $\ge 2$. Every such procedure gives a
new algorithm for a convergent choice of $\rho_k = \rho$.

$\rho$-Algorithm 3. We have the following
Proposition 1.6. Assume that $J$ satisfies the hypotheses H1 - H3. If
$h > 0$ is such that

(1.21) $\Delta J_h/h \ge (1 - C) J'_0$, $\Delta J_{2h}/2h \le (1 - C) J'_0$

with some constant $C$, $0 < C < 1$, then $(\rho_k =) \rho = h$ is a convergent
choice.

Proof. From the inequality (1.9)' and the second inequality in (1.21)
we get

$2h J'_0 - (2h)^2 M \le \Delta J_{2h} \le (1 - C) 2h J'_0$

and hence

$C\hat\rho = C J'_0/2M \le h.$

Now the first inequality in (1.21) implies

(1.22) $\Delta J_h \ge h (1 - C) J'_0 \ge C (1 - C) (J'_0)^2/2M,$

which proves that $\rho = h$ is a convergent choice since $\Delta J_h = J(u_k) -
J(u_k - h w_k) \to 0$ implies that $J'_0 = J'(u_k, w_k) \to 0$ as $k \to +\infty$.

□

We shall now show that there exists an $h > 0$ satisfying (1.21). We
consider the real valued function

$\psi(\rho) = \Delta J_\rho/\rho - (1 - C) J'_0$

of $\rho$ on $\mathbb{R}_+$ and observe the following two facts:

(1) $\psi(\rho) > 0$ for $\rho > 0$ sufficiently small. In fact, since $\Delta J_\rho/\rho \to
J'_0 > 0$ as $\rho \to 0$ we have $|\Delta J_\rho/\rho - J'_0| < C J'_0$ for $\rho > 0$ sufficiently small,
which, in particular, implies the assertion.

(2) $\psi(\rho) < 0$ for $\rho > 0$ sufficiently large. For this, since $u_k, w_k$ are
already determined (at the $(k + 1)$th stage of the algorithm) we
see that $\|\rho w_k\| \to +\infty$ and hence $\|u_k - \rho w_k\| \to +\infty$ as $\rho \to +\infty$. Then, by
hypothesis (H1),

$J(u_k - \rho w_k) \to +\infty$ as $\rho \to +\infty$

so much so that

$\Delta J_\rho < 0 < \rho (1 - C) J'_0$ for $\rho > 0$

sufficiently large, which implies the assertion.

Thus the sign of $\psi$ changes from positive to negative, say at some
$\rho = h_0 > 0$. Then, for instance, $h = 3h_0/4$ will satisfy our requirement.

More precisely, we can find $h$ satisfying (1.21) in the following
iterative manner. Assume that $0 < C < 1$ is given.

First of all we shall choose a $\tau$ arbitrarily $(> 0)$ and we compute the
difference quotient $\Delta J_\tau/\tau$. This is possible since all the quantities are
known. Then there are two possible cases that can arise namely, either

(a) $\Delta J_\tau/\tau \ge (1 - C) J'_0$

or (b) $\Delta J_\tau/\tau < (1 - C) J'_0$.

Suppose (a) holds. Then we compute $\Delta J_{2\tau}/2\tau$ and we will have to
consider again two possibilities:

either $(a)_1$ $\Delta J_{2\tau}/2\tau \le (1 - C) J'_0$,

or $(a)_2$ $\Delta J_{2\tau}/2\tau > (1 - C) J'_0$.

If we have the first possibility $(a)_1$ then we are through and we can
choose $h = \tau$ itself. If on the other hand $(a)_2$ holds then we repeat
this argument with $\tau$ replaced by $2\tau$.

Next suppose (b) holds. We can consider two possible cases:

either $(b)_1$ $\Delta J_{\tau/2}/(\tau/2) \ge (1 - C) J'_0$,

or $(b)_2$ $\Delta J_{\tau/2}/(\tau/2) < (1 - C) J'_0$.

Once again, in case $(b)_1$ holds we are through and we can choose
$h = \tau/2$. In case $(b)_2$ holds we repeat this argument with $\tau$ replaced by
$\tau/2$.
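The doubling and halving procedure for finding $h$ translates almost line by line into code. The sketch below is an illustration under assumptions: `delta_J` is a user-supplied function returning $\Delta J_\rho = J(u_k) - J(u_k - \rho w_k)$, `dJ0` stands for $J'_0 > 0$, and the iteration cap `max_iter` is an added safeguard that is not part of the notes.

```python
def find_h(delta_J, dJ0, C, tau, max_iter=60):
    """Search for h > 0 satisfying the two inequalities (1.21):

        delta_J(h)/h >= (1 - C)*dJ0   and   delta_J(2h)/(2h) <= (1 - C)*dJ0.

    delta_J(rho) must return J(u_k) - J(u_k - rho*w_k); tau > 0 is the
    initial trial value and 0 < C < 1.
    """
    target = (1.0 - C) * dJ0
    for _ in range(max_iter):
        if delta_J(tau) / tau >= target:                 # case (a)
            if delta_J(2.0 * tau) / (2.0 * tau) <= target:
                return tau                               # case (a)_1: h = tau
            tau *= 2.0                                   # case (a)_2
        else:                                            # case (b)
            half = tau / 2.0
            if delta_J(half) / half >= target:
                return half                              # case (b)_1: h = tau/2
            tau = half                                   # case (b)_2
    raise RuntimeError("no admissible h found within max_iter trials")
```

For instance, with $J(x) = x^2$, $u_k = 1$, $w_k = 1$ one has $\Delta J_\rho = 2\rho - \rho^2$ and $J'_0 = 2$; for $C = 1/2$ the admissible values are exactly $1/2 \le h \le 1$, and the search reaches such an $h$ both from a small and from a large starting value of $\tau$.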

Remark 1.2. It was proposed by Goldstein (see KTl ) that the initial 
value of $\tau$ can be taken to be $\tau = J'_0$.






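$\rho$-Algorithm 4. We have the following

Proposition 1.7. If there is a $\tilde\rho$ such that

(1.23) $\tilde\rho > 0$, $J'_{\tilde\rho} = 0$, $J_{\tilde\rho} \le J_\rho$ for all $\rho \in [0, \tilde\rho]$

then $\rho = \tilde\rho$ is a convergent choice.

Proof. We have, by the condition $J'_{\tilde\rho} = 0$ in (1.23) together with the
estimate (1.7),

$J'_0 = |J'_{\tilde\rho} - J'_0| \le M \tilde\rho$

and hence $\hat\rho < 2\hat\rho = J'_0/M \le \tilde\rho$ using the value of $\hat\rho$ given by (1.10).
The condition (1.23) that $J_{\tilde\rho}$ is a minimum in $[0, \tilde\rho]$ implies $J_{\tilde\rho} \le J_{\hat\rho}$ and
therefore

$\Delta J_{\hat\rho} = J_0 - J_{\hat\rho} \le J_0 - J_{\tilde\rho} = \Delta J_{\tilde\rho}.$

On the other hand, taking $C = 1$ in (1.13) we find that

(1.24) $(J'_0)^2/4M \le \Delta J_{\hat\rho} \le \Delta J_{\tilde\rho}$

which proves that $\rho = \tilde\rho$ is a convergent choice.

□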

We shall conclude the discussion of convergent choices of $\rho_k$
by observing that other algorithms for convergent choices of $\rho$ can be
obtained making use of the following remarks.

Remark 1.3. We recall that in $\rho$-Algorithm 1 we obtained convergent
choices of $\rho$ to be close to $\hat\rho$ (i.e. $C \le \rho/\hat\rho \le 2 - C$) where $\hat\rho$ is the point
of minimum of the curve $y = T(\rho)$, which is a polynomial of degree 2.
This method can be generalised to get other algorithms as follows:

Starting from $u_0$ if we have found $u_k$ and the direction of descent $w_k$
then $J_0 = J(u_k)$, $J'_0 = J'(u_k, w_k) = (G(u_k), w_k)$ are known. Now if we
are given two more points (say $h$ and $2h$) we know the values of $J$ at
these points also. Thus we know values at 3 points and the initial slope
(i.e. $J'_0$). By interpolation we can find a polynomial of degree 3 from
these. To get an algorithm for a convergent choice of $\rho$ we can choose
$\rho$ to be close to the point where such a polynomial has a minimum.
A similar method also works with polynomials of higher degree if we are
given more points, by using interpolation.

Remark 1.4. In all our proofs for convergent choices of $\rho$ we obtained
an estimate of the type:

$\gamma (J'_0)^2 \le \Delta J_\rho$

where $\gamma$ is a constant $> 0$. For instance $\gamma = 2/9M$ in (1.20).
1.4 Convergence of Algorithms

In the previous sections we have given some algorithms to construct a
minimising sequence for the solution of the minimisation problem:

Problem P. To find $u \in V$, $J(u) \le J(v)$, $\forall v \in V$.

In this section we shall prove that under some reasonable assumptions
on the functional $J$ any combination of w-algorithms and $\rho$-algorithms
yields a convergent algorithm for the construction of the
minimising sequence $u_k$ and such a sequence converges to a solution of the
problem P.

Let $J : V \to \mathbb{R}$ be a functional on a Banach space $V$. The following
will be the assumptions that we shall make on $J$:

(H0) $J$ is bounded below: there exists a real number $j$ such that
$-\infty < j \le J(v)$, $\forall v \in V$.

(H1) $J(v) \to +\infty$ as $\|v\| \to +\infty$.

(H2) $J$ has a gradient $G(u)$ everywhere in $V$ and $G(u)$ is bounded on
every bounded subset of $V$: if $K$ is a bounded set in $V$ then there
exists a constant $M_K > 0$ such that $\|G(u)\| \le M_K$ for all $u \in K$.

(H3) $J$ is convex.

(H4) $V$ is a reflexive Banach space.

(H5) $J$ is strictly convex.

(H6) $J$ admits a Hessian $H(u)$ everywhere in $V$ which is $V$-coercive:
there exists a constant $\alpha > 0$ such that

$\langle H(u)\varphi, \varphi \rangle_{V' \times V} \ge \alpha \|\varphi\|^2_V$, $\forall u \in V$ and $\forall \varphi \in V$.

As in the previous sections we shall restrict ourselves to the case of
a Hilbert space $V$ and all our arguments remain valid with almost no
changes. We have the following result.

Theorem 1.1. (1) If the hypotheses H0, H1, H2 are satisfied and if $u_k$
is a sequence constructed using any of the algorithms:

w-Algorithm $i$, $i = 1, 2$
$\rho$-Algorithm $j$, $j = 1, 3, 4$

then

$\|G(u_k)\| \to 0$ as $k \to +\infty$.

(2) If the hypotheses H0 - H4 hold and if the $u_k$ are constructed using
the algorithms $i = 1, 2$, $j = 1, 2, 3, 4$ then all the algorithms have the
following property:

(a) the sequence $u_k$ has a weak cluster point;

(b) any weak cluster point is a solution of the problem P.

(3) If the hypotheses H0 - H5 are satisfied then

(a) the Problem P has a unique solution $u \in V$,

(b) if $u_k$ is constructed using any of the algorithms $i = 1, 2$,
$j = 1, 2, 3, 4$ then

$u_k \rightharpoonup u$ as $k \to +\infty$.

(4) Under the hypotheses H0 - H6 we have

(a) the Problem P has a unique solution $u \in V$,

(b) if the sequence $u_k$ is constructed using any of the algorithms
$i = 1, 2, 3$, $j = 1, 2, 3, 4$ then

$u_k \to u$ and moreover $\|u_k - u\| \le (2/\alpha) \|G(u_k)\|$ $\forall k$.

Proof. (1) Since, by (H0), $J(u_k)$ is a decreasing sequence bounded
below: $j \le J(u_{k+1}) \le J(u_k) \le J(u_0)$, $\forall k$, it follows that

$\lim_{k \to +\infty} (J(u_k) - J(u_{k+1})) = 0.$

Since by the $\rho$-Algorithms $j$ $(j = 1, 3, 4)$ the choice of $\rho = \rho_k$ in
$u_{k+1} = u_k - \rho_k w_k$ is a convergent choice we see that

$J'(u_k, w_k) \to 0$ as $k \to +\infty$.

Now since the choice of $w_k$ by w-Algorithm $i$ $(i = 1, 2)$ is convergent
this implies that

$\|G(u_k)\| \to 0$ as $k \to +\infty$.

(2) As we have seen in the previous section, if $u_0 \in V$ then the set $U =
\{v \mid v \in V, J(v) \le J(u_0)\}$ is bounded by (H1) and since

$J(u_{k+1}) \le J(u_k) \le \cdots \le J(u_0)$ $\forall k$

all the $u_k \in U$ and thus $u_k$ is a bounded sequence. Then (H4) implies
that $u_k$ has a weak cluster point, which proves (a), i.e. there exists a
subsequence $u_{k'}$ such that $u_{k'} \rightharpoonup u$ in $V$ as $k' \to +\infty$. Now by
(H3) and by Proposition 1.3.1 on convex functionals

(1.25) $J(v) \ge J(u_{k'}) + J'(u_{k'}, v - u_{k'})$ for any $v \in V$ and any $k'$.

Then, by (H2), $J'(u_{k'}, v - u_{k'}) = (G(u_{k'}), v - u_{k'})$. But here $v - u_{k'}$
is a bounded sequence and since all the assumptions of Part 1 of
the theorem are satisfied $\|G(u_{k'})\| \to 0$, i.e. $G(u_{k'}) \to 0$ strongly in
$V$. Hence

$|(G(u_{k'}), v - u_{k'})| \le \text{const.} \|G(u_{k'})\| \to 0$ as $k' \to +\infty$

and so we find from (1.25) that

$J(v) \ge \liminf_{k' \to +\infty} J(u_{k'})$

or what is the same as saying $J(v) \ge J(u)$ $\forall v \in V$. Thus $u$ is a
solution of the Problem P which proves (b).

(3) The strict convexity of $J$ implies the convexity of $J$ (i.e. H5
implies H3) and hence by (b) of Part 2 of the theorem the Problem P
has a solution $u \in V$. Moreover, by Proposition 1.3.1 this solution
is unique since $J$ is strictly convex.

Again by (2)(a) of the theorem $u_k$ is a bounded sequence and has
a weak cluster point $u$ which is unique and hence $u_k \rightharpoonup u$ as
$k \to +\infty$.

(4) Since coercivity of $H(u)$ implies that $J$ is strictly convex (a) is
just the same as (3)(a). To prove (b) we expand $J(u)$ by Taylor's
formula: there is a $\theta$ in $0 < \theta < 1$ such that

$J(u) = J(u_k) + J'(u_k, u - u_k) + \frac{1}{2} J''(u_k + \theta(u - u_k); u - u_k, u - u_k)$

$= J(u_k) + (G(u_k), u - u_k) + \frac{1}{2} (H(u_k + \theta(u - u_k))(u - u_k), u - u_k).$

Here

$|(G(u_k), u - u_k)| \le \|G(u_k)\| \|u - u_k\|$ $\forall k$

and

$(H(u_k + \theta(u - u_k))(u - u_k), u - u_k) \ge \alpha \|u - u_k\|^2$ $\forall k$.

These two together with the fact that $u$ is a solution of the Problem
P imply that

$J(u_k) \ge J(u) \ge J(u_k) - \|G(u_k)\| \|u - u_k\| + \frac{\alpha}{2} \|u - u_k\|^2$ $\forall k$

which gives

$\|u - u_k\| \le (2/\alpha) \|G(u_k)\|$ $\forall k$.

But, by Part 1 of the theorem the right hand side here $\to 0$ as
$k \to +\infty$ and this proves that $u_k \to u$ as $k \to +\infty$.
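The a posteriori estimate $\|u - u_k\| \le (2/\alpha)\|G(u_k)\|$ is useful in practice because its right hand side is computable without knowing $u$. A minimal numerical check, under assumptions, is sketched below: the quadratic functional with diagonal matrix `A` (smallest eigenvalue `ALPHA`), the vector `f` and the helper names are illustrative choices, not data from the notes.

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matvec(M, x):
    return [dot(row, x) for row in M]

def norm(x):
    return math.sqrt(dot(x, x))

# Assumed functional: J(v) = 1/2 (Av, v) - (f, v) with Hessian A.
A = [[1.0, 0.0], [0.0, 4.0]]
f = [1.0, 4.0]
ALPHA = 1.0          # coercivity constant: smallest eigenvalue of A
U_STAR = [1.0, 1.0]  # minimiser, i.e. the solution of A u = f

def G(v):
    return [a - b for a, b in zip(matvec(A, v), f)]

def error_and_bound(v, alpha=ALPHA):
    """Return (||v - u||, (2/alpha) * ||G(v)||) for a trial point v."""
    err = norm([a - b for a, b in zip(v, U_STAR)])
    return err, 2.0 / alpha * norm(G(v))
```

For every trial point the first returned value should not exceed the second, so a stopping test of the form $\|G(u_k)\| \le \alpha\varepsilon/2$ guarantees $\|u - u_k\| \le \varepsilon$.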



2 Generalized Newton's Method 

In this section we give another algorithm for the construction of
approximating sequences for the minimisation problem for functionals $J$ on a
Banach space $V$ using first and second order G-derivatives of $J$. This
algorithm generalizes the Newton-Raphson method, which consists in
giving approximations to determine points of $V$ where a given operator
vanishes. The method we describe is a refinement of a method by R. 
Fages 1531 . 

We can describe our approach to the algorithm as follows: Suppose
$J : V \to \mathbb{R}$ is a very regular functional on a Banach space $V$; for
instance, $J$ has a gradient $G(u)$ and a Hessian $H(u)$ everywhere in $V$. Let
$u \in V$ be a point where $J$ attains its minimum, i.e. $J(u) \le J(v)$ $\forall v \in V$. We
have seen in Chapter 2 (Theorem 2.1.3) that $G(u) = 0$ is a
necessary condition and we have also discussed the question of when this
condition is also sufficient in Chapter 2. Thus finding a minimising
sequence for $J$ at $u$ is reduced to the equivalent problem of finding
an algorithm to construct a sequence approximating a solution of the
equation:

(*) $u \in V$, $G(u) = 0$.

In this sense this is an extension of the classical Newton method for
the determination of zeros of a real valued function on the real line.

As in the previous section we shall restrict ourselves to the case of a
Hilbert space $V$.

Starting from an initial point $u_0 \in V$ suppose we have constructed $u_k$.
If $u_k$ is sufficiently near the solution $u$ of the equation $G(u) = 0$ then by
expanding $G(u)$ using Taylor's formula we find:

$0 = (G(u), \varphi) = (G(u_k) + H(u_k + \theta(u - u_k))(u - u_k), \varphi).$

The Newton-Raphson method consists in taking $u_{k+1}$ as a solution
of the equation

$G(u_k) + H(u_k)(u_{k+1} - u_k) = 0$ for $k \ge 0$.

Roughly speaking, if the operator $H(u_k) \in \mathscr{L}(V, V') = \mathscr{L}(V, V)$ is
invertible and if $H(u_k)^{-1} \in \mathscr{L}(V, V)$ then the equation is equivalent to

$u_{k+1} = u_k - H(u_k)^{-1} G(u_k).$

Then one can show that under suitable assumptions on $G$ and $H$
this is a convergent algorithm provided that the initial point $u_0$ is
sufficiently close to the required solution $u$ of the problem (*). However,
in practice, $u$ and hence a good neighbourhood of $u$ where $u_0$ is to be taken
is not known a priori and is difficult to find.

The algorithm we give in the following avoids such a difficulty for
the choice of the initial point $u_0$ in the algorithm.

Let $V$ be a Hilbert space and $J : V \to \mathbb{R}$ be a functional on $V$.
Throughout this section we make the following hypotheses on $J$:

(H1) $J(v) \to +\infty$ as $\|v\| \to +\infty$.

(H2) $J$ is regular: $J$ is twice G-differentiable and has a gradient $G(u)$
and a Hessian $H(u)$ everywhere in $V$.

(H3) $H$ is uniformly $V$-coercive on bounded sets of $V$: for every
bounded set $K$ of $V$ there exists a constant $\alpha_K > 0$ such that

$(H(v)\varphi, \varphi) \ge \alpha_K \|\varphi\|^2$, $\forall v \in K$ and $\forall \varphi \in V$.

(H4) $H$ satisfies a uniform Lipschitz condition on bounded sets of $V$:
for every bounded subset $K$ of $V$ there exists a constant $\beta_K > 0$
such that

$\|H(u) - H(v)\| \le \beta_K \|u - v\|$, $\forall u, v \in K$.

We are interested in finding an algorithm starting from a $u_0 \in V$
to find $u_k$ iteratively. Suppose we have determined $u_k$ for some
$k \ge 0$. In order to determine $u_{k+1}$ we introduce a bilinear bicontinuous
form $b_k : V \times V \ni (\varphi, \psi) \mapsto b_k(\varphi, \psi) \in \mathbb{R}$ satisfying either
one of the following two hypotheses:

(H5) There exist two constants $\lambda_0 > 0$, $\mu_0 > 0$ independent of $k$, $\lambda_0$
large enough (see (2.12)), such that

$b_k(\varphi, \varphi) \ge \lambda_0 (G(u_k), \varphi)^2$, $\forall \varphi \in V$,

and

$|b_k(\varphi, \psi)| \le \mu_0 \|G(u_k)\|^2 \|\varphi\| \|\psi\|$, $\forall \varphi, \psi \in V$.

(H6) There exist two constants $\lambda_1 > 0$, $\mu_1 > 0$ independent of $k$, $\lambda_1$ large
enough (see (2.14)), such that

$b_k(\varphi, \varphi) \ge \lambda_1 \|G(u_k)\|^{1+\varepsilon} \|\varphi\|^2$, $\forall \varphi \in V$

and

$|b_k(\varphi, \psi)| \le \mu_1 \|G(u_k)\|^{1+\varepsilon} \|\varphi\| \|\psi\|$, $\forall \varphi, \psi \in V$,

where $\varepsilon > 0$.

It is easy to see that there does always exist such a bilinear form as
can be seen from the following examples.

Example 2.1. $b_k(\varphi, \psi) = \lambda_k (G_k, \varphi)(G_k, \psi)$, $0 < \lambda_0 \le \lambda_k \le \mu_0 < +\infty$, $\lambda_0$
large enough.

Example 2.2. $b_k(\varphi, \psi) = \lambda_k \|G_k\|^2 (\varphi, \psi)$, $0 < \lambda_0 \le \lambda_k \le \mu_0 < +\infty$.
The Cauchy-Schwarz inequality shows that (H5) is satisfied by this and (H6)
is satisfied with $\varepsilon = 1$.

Example 2.3. Let $\lambda_k > 0$ be a number in a fixed interval $0 < \lambda_1 \le \lambda_k \le
\mu_1 < +\infty$; then the bilinear form

$b_k(\varphi, \psi) = \lambda_k \|G(u_k)\|^{1+\varepsilon} (\varphi, \psi)$

satisfies (H6).

We are now in a position to describe our algorithm.

Algorithm. Suppose we choose an initial point $u_0$ in the algorithm
arbitrarily and that we have determined $u_k$ for some $k \ge 0$. Consider the
linear problem:

(2.1) to find $\Delta_k \in V$ satisfying the linear equation

$(H(u_k)\Delta_k, \varphi) + b_k(\Delta_k, \varphi) = -(G(u_k), \varphi)$, $\forall \varphi \in V$.

Here since $H(u_k)$ is $V$-coercive and $b_k$ is positive semi-definite on
$V$, i.e.

$(H(u_k)\varphi, \varphi) \ge \alpha \|\varphi\|^2$, $\forall \varphi \in V$ (by (H3))

(with $\alpha = \alpha(u_k) > 0$, a constant) and

$b_k(\varphi, \varphi) \ge 0$ (by (H5) or (H6)),

the linear problem (2.1) has a unique solution $\Delta_k \in V$.
Now we set

$u_{k+1} = u_k + \Delta_k$

where $\Delta_k$ is the unique solution of the problem (2.1). Clearly, our
algorithm depends on the choice of the bilinear form $b_k(\varphi, \psi)$. We also see
that if $b_k = 0$ our algorithm is nothing but the classical Newton method
as we have described in the introduction to this section.
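In finite dimension the linear problem (2.1) is a linear system. The sketch below illustrates one possible instance in $\mathbb{R}^2$ with the bilinear form of Example 2.3, $b_k(\varphi, \psi) = \lambda \|G(u_k)\|^{1+\varepsilon}(\varphi, \psi)$. Everything specific in it is an assumption made for illustration: the test functional $J(v) = v_1^4/4 + v_1^2/2 + v_2^2$, the values $\lambda = 1$ and $\varepsilon = 1$, the starting point and the function names.

```python
import math

def norm(x):
    return math.sqrt(sum(c * c for c in x))

def gradient(v):
    # Gradient of the assumed functional J(v) = v1^4/4 + v1^2/2 + v2^2.
    return [v[0] ** 3 + v[0], 2.0 * v[1]]

def hessian(v):
    # Hessian of the same functional (symmetric positive definite).
    return [[3.0 * v[0] ** 2 + 1.0, 0.0], [0.0, 2.0]]

def newton_step(H, g, lam, eps):
    """Solve (H + lam*||g||^(1+eps) I) delta = -g for a 2x2 matrix H.

    This is (2.1) with b_k(phi, psi) = lam*||g||^(1+eps)*(phi, psi);
    the 2x2 system is solved by Cramer's rule.
    """
    shift = lam * norm(g) ** (1.0 + eps)
    a, b = H[0][0] + shift, H[0][1]
    c, d = H[1][0], H[1][1] + shift
    det = a * d - b * c
    return [(-g[0] * d + g[1] * b) / det, (-g[1] * a + g[0] * c) / det]

def minimise(u0, lam=1.0, eps=1.0, steps=30):
    """Iterate u_{k+1} = u_k + delta_k starting from u0."""
    u = list(u0)
    for _ in range(steps):
        delta = newton_step(hessian(u), gradient(u), lam, eps)
        u = [a + b for a, b in zip(u, delta)]
    return u
```

The extra term acts as a damping that is strong when $\|G(u_k)\|$ is large and disappears as $\|G(u_k)\| \to 0$, so that close to the minimum the iteration behaves like the classical Newton method (the case `lam = 0`).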
We have now the main result of this section. 

Theorem 2.1. Suppose $J$ satisfies the hypotheses (H1) - (H4) and $b_k$
satisfies either the hypothesis (H5) or (H6) for each $k \ge 0$. Then we
have:

(1) The minimization problem:

to find $u \in V$, $J(u) \le J(v)$, $\forall v \in V$ has a unique solution.

(2) The sequence $u_k$ is well defined by the algorithm.

(3) The sequence $u_k$ converges to the solution $u$ of the minimization
problem: $\|u_k - u\| \to 0$ as $k \to +\infty$.

(4) There exist constants $\gamma_1 > 0$, $\gamma_2 > 0$ such that

$\gamma_1 \|u_{k+1} - u_k\| \le \|u_k - u\| \le \gamma_2 \|u_{k+1} - u_k\|$, $\forall k$.

(5) The convergence of $u_k$ to $u$ is quadratic: there exists a constant
$\gamma_3 > 0$ such that

$\|u_{k+1} - u\| \le \gamma_3 \|u_k - u\|^2$, $\forall k$.

In the course of the proof we shall use the notation introduced in the
previous section: $J_k, G_k, H_k, \Delta J_k, \dots$ respectively denote $J(u_k)$, $G(u_k)$,
$H(u_k)$, $J(u_k) - J(u_{k+1}), \dots$

Proof. We shall carry out the proof in several steps.

Step 1. Let $U$ be the subset of $V$:

$U = \{v \mid v \in V; J(v) \le J(u_0)\}.$

If there exists a solution $u$ of the minimization problem then $u$
necessarily belongs to this set $U$ (irrespective of the choice of $u_0$). The set
$U$ is bounded in $V$. In fact, if it is not bounded then there exists a
sequence $u_j$ such that $u_j \in U$, $\|u_j\| \to +\infty$ and hence, by (H1), $J(u_j) \to +\infty$,
which is impossible since $J(u_j) \le J(u_0)$. By (H2) and (H3) $J$
has a Hessian which is positive definite everywhere. Hence $J$ is strictly
convex.

The set $U$ is also weakly closed. In fact, if $v_j \in U$ and $v_j \to v$ in $V$
then (strict) convexity of $J$ implies by Proposition 1.3.1 that we have

$J(u_0) \ge J(v_j) \ge J(v) + (G(v), v_j - v)$

and hence passing to the limit (since $G(v)$ does not depend on $j$) it follows
that $J(v) \le J(u_0)$ proving that $v \in U$, i.e. $U$ is closed (and hence also
weakly closed, $U$ being convex).

Now J and U satisfy all the hypothesis of Theorem 12.11 with 
X(t) = a\jt and hence it follows that there exists a unique ueU solution 
of the minimizing problem for J. We have already remarked that u is 
unique in V. This proves assertion (1) of the statement. 

We have also remarked before the statement of the theorem that the
linear problem (2.1) has a unique solution, which implies that $u_{k+1}$ is
well defined, and hence we have assertion (2) of the statement.

Step 2. $J(v)$, $G(v)$ and $H(v)$ are bounded on any bounded subset $K$ of $V$:
there exists a constant $\gamma_K > 0$ such that

$|J(v)| + \|G(v)\| + \|H(v)\| \le \gamma_K$, $\forall v \in K$.

In fact, let $d_K = \operatorname{diam} K$ and let $u \in K$ be any fixed point. By (H4) we
have

$\|H(v)\| \le \|H(v) - H(u)\| + \|H(u)\| \le \beta_K d_K + \|H(u)\|$

which proves that $H$ is bounded on $K$. Then Taylor's formula applied
to $G$ gives

$\|G(v) - G(u)\| \le \|H(u + \theta(v - u))\| \, \|v - u\|$

for some $0 < \theta < 1$. Now if $u, v \in K$ then $u + \theta(v - u)$ is also in a bounded
set $K_1 = \{w \mid w \in V,\ d(w, K) \le 2 d_K\}$ (for, if $w = u + \theta(v - u)$ and $a \in K$ then
$\|w - a\| = \|u - a + \theta(v - u)\| \le \|u - a\| + \|v - u\| \le 2 d_K$). Since $H$ is bounded
on $K_1$ it follows that $G$ is uniformly Lipschitz on $K$ and, as above, $G$ is
also bounded on $K$. A similar argument proves that $J$ is also bounded on $K$.
For the sake of simplicity we shall write

$\alpha = \alpha_U$, $\gamma = \gamma_U$.

Step 3. Suppose $u_k \in U$ for some $k \ge 0$. (This is trivial for $k = 0$ by the
definition of the set $U$.) Then $\Delta_k$ is also bounded.
For this, taking $\varphi = \Delta_k$ in (2.1) we get

(2.3) $(H_k \Delta_k, \Delta_k) + b_k(\Delta_k, \Delta_k) = -(G_k, \Delta_k)$.

By using the coercivity of $H_k = H(u_k)$ (hypothesis (H3)) and the fact
that $b_k(\Delta_k, \Delta_k) \ge 0$ we get

(2.4) $\alpha \|\Delta_k\|^2 \le -(G_k, \Delta_k)$.

Then the Cauchy-Schwarz inequality applied to the right hand side of
(2.4) gives

(2.5) $\|\Delta_k\| \le \|G_k\| / \alpha$.

Suppose $0 < \ell < +\infty$ is such that $\sup_{u \in U} \|G(u)\| / \alpha \le \ell$ (for example
we can take $\ell = \gamma / \alpha$) and suppose $U_1$ is the set

$U_1 = \{v \mid v \in V;\ \exists w \in U \text{ such that } \|v - w\| \le \ell\}$.

Then $U_1$ is bounded and, by (2.5), $u_{k+1} = u_k + \Delta_k \in U_1$:

(2.6) $u_{k+1} \in U_1$.

We shall in fact show later that $u_{k+1} \in U$ itself.
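Step 4. Estimate for $\Delta J_k$ from below. By Taylor's formula we have

(2.7) $J_{k+1} = J_k + (G_k, \Delta_k) + \frac{1}{2}(\bar{H}_k \Delta_k, \Delta_k)$,

where

$\bar{H}_k = H(u_k + \theta \Delta_k)$ for some $\theta$ in $0 < \theta < 1$.

Replacing $(G_k, \Delta_k)$ in (2.7) by (2.3) we have

$J_{k+1} = J_k - (H_k \Delta_k, \Delta_k) - b_k(\Delta_k, \Delta_k) + \frac{1}{2}(\bar{H}_k \Delta_k, \Delta_k)$

$\phantom{J_{k+1}} = J_k - \frac{1}{2}(H_k \Delta_k, \Delta_k) - b_k(\Delta_k, \Delta_k) + \frac{1}{2}((\bar{H}_k - H_k)\Delta_k, \Delta_k)$.

Now using the $V$-coercivity of $H_k$ (hypothesis (H3)) and the Lipschitz
continuity (hypothesis (H4)) of $H$ on the bounded set $U_1$ we find (since
$u_k + \theta \Delta_k \in U_1$):

$J_{k+1} \le J_k - \frac{\alpha}{2}\|\Delta_k\|^2 - b_k(\Delta_k, \Delta_k) + \frac{1}{2}\beta_{U_1}\|\Delta_k\|^3$.

Thus setting

(2.8) $\beta = \beta_{U_1}$

we obtain

(2.9) $\frac{\alpha}{2}\|\Delta_k\|^2 + b_k(\Delta_k, \Delta_k) - \frac{1}{2}\beta\|\Delta_k\|^3 \le \Delta J_k \ (= J_k - J_{k+1})$.

In particular, since $b_k$ is positive (semi-)definite,

(2.10) $\frac{\alpha}{2}\|\Delta_k\|^2 \left(1 - \frac{\beta}{\alpha}\|\Delta_k\|\right) \le \Delta J_k$.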

In the method of Newton-Raphson we have only (2.10).

Step 5. $\Delta J_k$ is bounded below by a positive number: if $0 < C < 1$ is any
number then we have

(2.11) $\frac{\alpha C}{2}\|\Delta_k\|^2 \le \Delta J_k$.

To prove this we consider two cases:

(i) $\|\Delta_k\|$ is sufficiently small, i.e. $\|\Delta_k\| \le (1 - C)\alpha/\beta$, and

(ii) $\|\Delta_k\|$ is large, i.e. $\|\Delta_k\| > (1 - C)\alpha/\beta$.

If (i) holds then (2.11) is immediate from (2.10). Suppose that (ii) holds.
By hypothesis (H5) and by (2.3) together with the coercivity of $H_k$:

$b_k(\Delta_k, \Delta_k) \ge \lambda_0 (G_k, \Delta_k)^2 \ge \lambda_0 \alpha^2 \|\Delta_k\|^4$.

Then from (2.9) we get

$\frac{\alpha}{2}\|\Delta_k\|^2 + \lambda_0 \alpha^2 \|\Delta_k\|^4 - \frac{\beta}{2}\|\Delta_k\|^3 \le \Delta J_k$,

i.e. $\frac{\alpha}{2}\|\Delta_k\|^2 + \lambda_0 \alpha^2 \|\Delta_k\|^3 \left(\|\Delta_k\| - \frac{\beta}{2\lambda_0 \alpha^2}\right) \le \Delta J_k$.

If we take

(2.12) $\lambda_0 \ge \beta^2 / (2\alpha^3 (1 - C))$

then we find that $\|\Delta_k\| > (1 - C)\alpha/\beta \ge \beta/(2\lambda_0 \alpha^2)$ and hence

(2.13) $\frac{\alpha}{2}\|\Delta_k\|^2 \le \Delta J_k$.

Since $0 < C < 1$ we again get (2.11) from (2.13). Suppose on the other
hand that (ii) holds and $b_k$ satisfies (H6) with a $\lambda_1$ to be determined. Again
from (2.9), (2.5) and hypothesis (H6) we have

$\frac{\alpha}{2}\|\Delta_k\|^2 + \lambda_1 \|G_k\|^{1+\varepsilon}\|\Delta_k\|^2 - \frac{\beta}{2\alpha}\|\Delta_k\|^2 \|G_k\| \le \Delta J_k$,

i.e. $\frac{\alpha}{2}\|\Delta_k\|^2 + \lambda_1 \|G_k\| \, \|\Delta_k\|^2 \left(\|G_k\|^{\varepsilon} - \frac{\beta}{2\alpha\lambda_1}\right) \le \Delta J_k$.

Using (ii) together with (2.5) we get

$\alpha^{2\varepsilon}(1 - C)^{\varepsilon}/\beta^{\varepsilon} < \alpha^{\varepsilon}\|\Delta_k\|^{\varepsilon} \le \|G_k\|^{\varepsilon}$

so that if $\alpha^{2\varepsilon}(1 - C)^{\varepsilon}/\beta^{\varepsilon} \ge \beta/(2\alpha\lambda_1)$ then we can conclude that

$\frac{\alpha}{2}\|\Delta_k\|^2 \le \Delta J_k$.

This is possible if $\lambda_1$ is large enough: i.e. if

(2.14) $\lambda_1 \ge \beta^{1+\varepsilon} / (2\alpha^{1+2\varepsilon}(1 - C)^{\varepsilon})$.

As before, since $0 < C < 1$ we find the estimate (2.11) also in this case.

Step 6. $J_k = J(u_k)$ is decreasing, $u_{k+1} \in U$ and $\|\Delta_k\| \to 0$ as $k \to +\infty$.
The estimate (2.11) shows that

$J_k - J_{k+1} = \Delta J_k \ge 0$,

which implies that $J_k$ is decreasing. On the other hand, since $u$ is the
solution of the minimization problem we have

$J(u) \le J_{k+1} \le J_k$,

which shows that $u_{k+1} \in U$, since $J(u_{k+1}) \le J(u_k) \le J(u_0)$ because $u_k \in U$.

Thus $J_k$ is a decreasing sequence bounded below (by $J(u)$) and hence
converges as $k \to +\infty$.
In particular

$\Delta J_k = J_k - J_{k+1} \ge 0$ and $\Delta J_k \to 0$ as $k \to +\infty$.

Then, by (2.11),

(2.15) $\|\Delta_k\| \to 0$ as $k \to +\infty$.

Step 7. The sequence $u_k$ converges (strongly) to $u$, the solution of the
minimization problem. In fact, we can write, by applying Taylor's
formula to $(G, \varphi)$, for $\varphi \in V$,

$(G_k, \varphi) = (G(u), \varphi) + (\tilde{H}_k(u_k - u), \varphi)$

where

$\tilde{H}_k = H(u + \theta(u_k - u))$ for some $\theta = \theta_\varphi$ in $0 < \theta < 1$.

But here $G(u) = 0$. Now replacing $(G_k, \varphi)$ by using (2.1) defining $\Delta_k$ we
obtain

(2.16) $(H_k \Delta_k, \varphi) + b_k(\Delta_k, \varphi) = -(\tilde{H}_k(u_k - u), \varphi)$, $\forall \varphi \in V$.

We take $\varphi = u_k - u$ in (2.16). Since $U$ is convex and since $u, u_k \in U$ it
follows that $u + \theta(u_k - u) \in U$. By the uniform $V$-coercivity of $H$ we know
that

$(\tilde{H}_k(u_k - u), u_k - u) \ge \alpha\|u_k - u\|^2$, $\alpha = \alpha_U$.

Applying the Cauchy-Schwarz inequality to the term $-(H_k \Delta_k, u_k - u)$ and
using the fact that $H_k$ is bounded we get

$|(H_k \Delta_k, u_k - u)| \le \gamma_U \|\Delta_k\| \, \|u_k - u\|$.

Then (2.16) will give

$\alpha\|u_k - u\|^2 \le \gamma\|\Delta_k\| \, \|u_k - u\| + |b_k(\Delta_k, u_k - u)|$.

On the other hand, $\|G(u_k)\|$ is bounded since $u_k \in U$. Let
$d = \sup_k \max(\mu_0\|G(u_k)\|^2, \mu_1\|G(u_k)\|^{1+\varepsilon}) < +\infty$. The hypothesis (H5) or (H6) together
with the last inequality imply

$\alpha\|u_k - u\|^2 \le (\gamma + d)\|\Delta_k\| \, \|u_k - u\|$,

i.e.

(2.17) $\|u_k - u\| \le \frac{\gamma + d}{\alpha}\|\Delta_k\|$.

Since $\|\Delta_k\| \to 0$ as $k \to +\infty$ by (2.15) we conclude from (2.17) that
$u_k \to u$ as $k \to +\infty$. Next, if we take $\varphi = \Delta_k$ in (2.16) we get

$(H_k \Delta_k, \Delta_k) + b_k(\Delta_k, \Delta_k) = -(\tilde{H}_k(u_k - u), \Delta_k)$.

Once again using the facts that $b_k$ is positive semi-definite by (H5) or
(H6) and that $H_k$ is $V$-coercive by (H3) we see that

$\alpha\|\Delta_k\|^2 \le \gamma\|u_k - u\| \, \|\Delta_k\|$

since $\tilde{H}_k$ is bounded because $u + \theta(u_k - u) \in U$ for any $\theta$ in $0 < \theta < 1$, i.e.
we have

(2.18) $\frac{\alpha}{\gamma}\|\Delta_k\| \le \|u_k - u\|$.

(2.17) and (2.18) together give the inequalities in assertion (4) of the
statement with $\gamma_1 = \alpha/\gamma$, $\gamma_2 = (\gamma + d)/\alpha$.

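Step 8. Finally we prove that the convergence $u_k \to u$ is quadratic. If
we set $\delta_k = u_k - u$ then $\Delta_k = \delta_{k+1} - \delta_k$ and (2.16) can now be written as

$(H_k \delta_{k+1}, \varphi) + b_k(\delta_{k+1}, \varphi) = (H_k \delta_k, \varphi) + b_k(\delta_k, \varphi) - (\tilde{H}_k \delta_k, \varphi)$

$\phantom{(H_k \delta_{k+1}, \varphi) + b_k(\delta_{k+1}, \varphi)} = ((H_k - \tilde{H}_k)\delta_k, \varphi) + b_k(\delta_k, \varphi)$.

Here we take $\varphi = \delta_{k+1}$. Applying the $V$-coercivity of $H_k$ (hypothesis (H3)),
using the positive semi-definiteness of $b_k$ on the left side and applying the
Cauchy-Schwarz inequality to the two terms on the right side, together
with the hypothesis (H4) to estimate $\|H_k - \tilde{H}_k\|$, we obtain

(2.19) $\alpha\|\delta_{k+1}\|^2 \le \|H_k - \tilde{H}_k\| \, \|\delta_k\| \, \|\delta_{k+1}\| + |b_k(\delta_k, \delta_{k+1})|$

$\phantom{(2.19)\ \alpha\|\delta_{k+1}\|^2} \le \beta\|\delta_k\|^2 \|\delta_{k+1}\| + |b_k(\delta_k, \delta_{k+1})|$.

But, by (H5),

(2.20) $|b_k(\delta_k, \delta_{k+1})| \le \mu_0 \|G_k\|^2 \|\delta_k\| \, \|\delta_{k+1}\|$.

On the other hand, by the mean-value property applied to $G$ we have

$\|G_k - G(u)\| \le \gamma\|u_k - u\|$

since for any $w \in U$, $\|H(w)\| \le \gamma$. As $G(u) = 0$ this implies that

(2.21) $\|G_k\| \le \gamma\|u_k - u\| = \gamma\|\delta_k\|$.

Substituting this in the above inequality (2.19) we get

$\alpha\|\delta_{k+1}\|^2 \le \beta\|\delta_k\|^2 \|\delta_{k+1}\| + \mu_0 \gamma^2 \|\delta_k\|^3 \|\delta_{k+1}\|$.

Now dividing by $\|\delta_{k+1}\|$ and using the fact that $\|\delta_k\| = \|u_k - u\| \le \operatorname{diam} U$
we get

$\|\delta_{k+1}\| \le \alpha^{-1}(\beta + \mu_0 \gamma^2 \|\delta_k\|)\|\delta_k\|^2$

$\phantom{\|\delta_{k+1}\|} \le \alpha^{-1}(\beta + \mu_0 \gamma^2 \operatorname{diam} U)\|\delta_k\|^2$

which is the required assertion (5) of the statement with
$\gamma_3 = \alpha^{-1}(\beta + \mu_0 \gamma^2 \operatorname{diam} U)$.

If we had used hypothesis (H6) instead of (H5) to estimate
$|b_k(\delta_k, \delta_{k+1})|$ we would get

(2.20)' $|b_k(\delta_k, \delta_{k+1})| \le \mu_1 \|G_k\|^{1+\varepsilon} \|\delta_k\| \, \|\delta_{k+1}\|$

in place of (2.20). Now (2.19) together with (2.21) gives (by exactly
the same arguments as in the earlier case)

$\|\delta_{k+1}\| \le \alpha^{-1}(\beta + \mu_1 \gamma^{1+\varepsilon}(\operatorname{diam} U)^{\varepsilon})\|\delta_k\|^2$.

In this case, we can take $\gamma_3 = \alpha^{-1}(\beta + \mu_1 \gamma^{1+\varepsilon}(\operatorname{diam} U)^{\varepsilon})$.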

This completely proves the theorem. □ 

We shall conclude this section with some remarks.

Remark 2.1. In the course of our proof all the hypotheses (H1)-(H5)
or (H6), except (H4), have been used only for elements $v$ in the
bounded set $U$, while the hypothesis (H4) has been used also for
elements in the bigger bounded set $U_1$.

Remark 2.2. As we have mentioned earlier, the proof of Theorem 2.1
given above includes the proof of the classical Newton-Raphson method
if we make the additional hypothesis that $u_0$ is close enough to $u$ such
that $\forall v \in U$ we have

$\frac{1}{\alpha}\|G(v)\| \le \frac{\alpha}{\beta}\, d$,

$d$ given in $]0, 1[$. Then using (2.5), (2.10) becomes

$(1 - d)\frac{\alpha}{2}\|\Delta_k\|^2 \le \Delta J_k$.
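
The step from (2.10) to the estimate of Remark 2.2 can be written out as follows. This is only a sketch: the printed hypothesis is partly illegible, and it is assumed here to read $\frac{1}{\alpha}\|G(v)\| \le \frac{\alpha}{\beta} d$ for all $v \in U$, with $\Delta_k$, $\alpha$, $\beta$ as in Steps 3 and 4.

```latex
\begin{align*}
\|\Delta_k\| &\le \frac{1}{\alpha}\|G_k\| \le \frac{\alpha}{\beta}\, d
  && \text{by (2.5) and the assumed hypothesis, since } u_k \in U, \\
1 - \frac{\beta}{\alpha}\|\Delta_k\| &\ge 1 - d > 0, \\
(1 - d)\,\frac{\alpha}{2}\|\Delta_k\|^2
  &\le \frac{\alpha}{2}\|\Delta_k\|^2
     \Bigl(1 - \frac{\beta}{\alpha}\|\Delta_k\|\Bigr)
  \le \Delta J_k
  && \text{by (2.10).}
\end{align*}
```

So, under this reading, the constant $1 - d$ plays the role of $C$ in (2.11) and no lower bound on $b_k$ is needed.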
Remark 2.3.

Example 2.4. Let $V = \mathbb{R}^n$. Then $G_k \in (\mathbb{R}^n)' = \mathbb{R}^n$. If we represent an
element $\varphi \in \mathbb{R}^n$ as a column matrix

$\varphi = (\varphi_1, \dots, \varphi_n)^t$

then $\varphi \varphi^t$ (with matrix multiplication) is a square matrix of order $n$. In
particular $G_k G_k^t$ is an $(n \times n)$ square matrix. Moreover, under the
hypotheses we have made, $H_k + \lambda G_k G_k^t$ is a positive definite matrix for $\lambda > 0$.
This corresponds to $b_k(\varphi, \psi) = \lambda (G_k, \varphi)(G_k, \psi) = \lambda (G_k G_k^t \varphi, \psi)$ and our
linear problem (2.1) is nothing but the system of $n$ linear equations

$(H_k + \lambda G_k G_k^t)\Delta_k = -G_k$

in $n$ unknowns $\Delta_k$.

Example 2.5. Similarly we can take $b_k(\varphi, \psi) = \lambda\|G_k\|^2(\varphi, \psi)$, and we get

$(H_k + \lambda\|G_k\|^2 I)\Delta_k = -G_k$.

Example 2.6. We can take $b_k(\varphi, \psi) = \lambda\|G_k\|^{1+\varepsilon}(\varphi, \psi)$ and we get

$(H_k + \lambda\|G_k\|^{1+\varepsilon} I)\Delta_k = -G_k$

as the corresponding system of linear equations.
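
The linear systems of Examples 2.5 and 2.6 can be tried out numerically. The following is a minimal sketch in $\mathbb{R}^2$, not part of the original text: the functional $J(x, y) = X^2 + Y^2 + X^4/4 + XY/2$ with $X = x - 1$, $Y = y + 2$ (minimum at $(1, -2)$), the helper names and the default parameters are all illustrative assumptions.

```python
import math

def grad(u):
    """Gradient G(u) of the illustrative functional J (an assumption, not from the text)."""
    x, y = u[0] - 1.0, u[1] + 2.0
    return [2.0 * x + x**3 + 0.5 * y, 2.0 * y + 0.5 * x]

def hess(u):
    """Hessian H(u) of the same functional; it is positive definite everywhere."""
    x = u[0] - 1.0
    return [[2.0 + 3.0 * x * x, 0.5], [0.5, 2.0]]

def solve_2x2(a, rhs):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(rhs[0] * a[1][1] - a[0][1] * rhs[1]) / det,
            (a[0][0] * rhs[1] - a[1][0] * rhs[0]) / det]

def modified_newton(u0, lam=1.0, eps=1.0, tol=1e-10, max_iter=200):
    """Iterate (H_k + lam*||G_k||^(1+eps) I) Delta_k = -G_k, u_{k+1} = u_k + Delta_k.

    eps = 1 gives the system of Example 2.5, 0 < eps < 1 that of Example 2.6,
    and lam = 0 the classical Newton method. Returns (u, number of steps)."""
    u = list(u0)
    for k in range(max_iter):
        g = grad(u)
        gnorm = math.hypot(g[0], g[1])
        if gnorm < tol:
            return u, k
        h = hess(u)
        shift = lam * gnorm ** (1.0 + eps)
        a = [[h[0][0] + shift, h[0][1]], [h[1][0], h[1][1] + shift]]
        delta = solve_2x2(a, [-g[0], -g[1]])
        u = [u[0] + delta[0], u[1] + delta[1]]
    return u, max_iter

u, steps = modified_newton([2.0, -1.0])
print(u, steps)
```

Far from the minimum the shift $\lambda\|G_k\|^{1+\varepsilon}$ is large and the steps are short; near the minimum it tends to zero and the iteration behaves like Newton's method, in line with assertion (5) of Theorem 2.1.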



Remark 2.4. The other algorithms given in this chapter make use
only of the calculation of the first G-derivative of $J$, while the Newton
method uses the calculation of the second order derivatives (Hessian)
of $J$. Hence each iteration of Newton's method takes longer and is
computationally more expensive than the methods based on the algorithms given earlier.

3 Other Methods

The following are some of the other interesting methods known in the 
literature to construct algorithms to approximate solutions of the mini- 
mization problems. We shall only mention these. 

(a) Conjugate gradient method: One of the algorithms in the class
of these methods is known as the Davidon-Fletcher-Powell method.
Here we need to compute the G-derivatives of first order of the
functional to be minimized. This is a very good and very widely
used method for many problems. (See [11] and [15]).

(b) Relaxation method: In this method it is not necessary to compute
the derivatives of the functionals. Later on in the next chapter we
shall give the relaxation method also when there are constraints. (See
Chapter 4, §4.5).

(c) Rosenbrock method. (See, for instance, [30]).

(d) Hooke and Jeeves method. (See, for instance, [30]).

For these last two methods also we need not compute the derivatives of
the functionals. They use suitable local variations.



Chapter 4 



Minimization with 
Constraints - Algorithms 

We have discussed the existence and uniqueness results for solutions
of the minimization problems for convex functionals on closed convex
subsets of a Hilbert space. This chapter will be devoted to giving
algorithms for the construction of minimizing sequences for solutions of this
problem. We shall describe only a few methods in this direction and we
prove that these algorithms are convergent.

1 Linearization Method 

The problem of minimization of a functional on a convex set is also
sometimes referred to as the problem of (non-linear) programming. If the
functional is convex the programming problem is called convex
programming.

The main idea of the method we shall describe in this section
consists in reducing, at each stage of the iteration, the problem of non-linear
convex programming to one of linear programming in one more
variable, i.e. to a problem of minimizing a linear functional on a convex
set defined by linear constraints. When we reduce to this case
we may not have coercivity. However, if we know that the convex set
defined this way by linear constraints is bounded then we have seen in
Chapter 2 that the linear programming problem has a solution (which is
not necessarily unique).

Then the solution of such a linear programming problem is used to
obtain convergent choices of $w$ and $\rho$.

Let $V$ be a Hilbert space and $K$ a closed convex subset of $V$. We shall
prescribe some of the constraints of the problem by giving a finite number
of convex functionals

$J_i : V \ni v \mapsto J_i(v) \in \mathbb{R}$, $i = 1, \dots, k$,

and we define a subset $U$ of $K$ by

$U = \{v \mid v \in K,\ J_i(v) \le 0,\ i = 1, \dots, k\}$.

Then $U$ is again a convex set in $V$. If $v, v' \in U$ then $v, v' \in K$ and
$(1 - \theta)v + \theta v' \in K$ for any $0 \le \theta \le 1$ since $K$ is convex. Now $J_i$ $(i = 1, \dots, k)$
being convex we have

$J_i((1 - \theta)v + \theta v') \le (1 - \theta)J_i(v) + \theta J_i(v') \le 0$, $i = 1, \dots, k$.

We note that in practice the convex set $K$ contains (i.e. is defined
by) all the constraints which need not be linearized, and the constraints
to be linearized are the $J_i$ $(i = 1, \dots, k)$.

Suppose now

$J_0 : V \ni v \mapsto J_0(v) \in \mathbb{R}$

is a convex functional on $V$. We consider the minimization problem:

Problem 1.1. To find $u \in U$, $J_0(u) \le J_0(v)$, $\forall v \in U$.

We assume that $J_0, J_1, \dots, J_k$ satisfy the following hypotheses:

Hypothesis on $J_0$: (HJ)$_0$.

(1) $J_0(v) \to +\infty$ as $\|v\| \to +\infty$.

(2) $J_0$ is regular: $J_0$ is twice differentiable everywhere
in $V$ and has a gradient $G_0$ and a hessian $H_0$ everywhere in $V$
which are bounded on bounded subsets: for every bounded set $U_1$
of $V$ there exists a constant $M_{U_1} > 0$ such that

$\|G_0(v)\| + \|H_0(v)\| \le M_{U_1}$, $\forall v \in U_1$.

(3) $H_0$ is uniformly $V$-coercive on bounded subsets of $V$: for every
bounded subset $U_1$ of $V$ there exists a constant $\alpha_{U_1} > 0$ such that

$(H_0(v)\varphi, \varphi) \ge \alpha_{U_1}\|\varphi\|^2$, $\forall \varphi \in V$ and $\forall v \in U_1$.

Hypothesis on $J_i$: (HJ)$_i$.

(1) $J_i$ is regular: $J_i$ is twice G-differentiable everywhere in $V$ and has
a gradient $G_i$ and a hessian $H_i$ bounded on bounded sets of $V$: for
every bounded set $U_1$ of $V$ there exists a constant $M_{U_1} > 0$ such
that

$\|G_i(v)\| + \|H_i(v)\| \le M_{U_1}$, $\forall v \in U_1$, $i = 1, \dots, k$.

(2) $H_i(v)$ is positive semi-definite:

$(H_i(v)\varphi, \varphi) \ge 0$, $\forall \varphi \in V$ $(\forall v \in U_1)$.

Hypothesis on $K$: (HK). There exists an element $Z \in K$ such that $J_i(Z) < 0$
for all $i = 1, \dots, k$.

The hypothesis (HK) in particular implies that $U \ne \emptyset$.

In order to describe the algorithm let $u_0 \in U$ be the initial point
(arbitrarily fixed) of the algorithm. In view of the hypothesis (HJ)$_0$(1) we
may, without loss of generality, assume that $U$ is bounded since
otherwise we can restrict ourselves to the set

$\{v \in U;\ J_0(v) \le J_0(u_0)\}$

which is bounded by (HJ)$_0$(1). So in the rest of our discussion we
assume $U$ to be bounded.

Next, by hypothesis (HJ)$_i$(1), the bounded convex set $U$ is also
closed. In fact, if $v_n \in U$ and $v_n \to v$ then, since $K$ is closed, $v \in K$.
Moreover, by the mean value property applied to $J_i$ $(i = 1, \dots, k)$ we have

$|J_i(v_n) - J_i(v)| \le \|G_i\| \, \|v_n - v\|$

so that $J_i(v_n) \to J_i(v)$ and hence $J_i(v) \le 0$ for $i = 1, \dots, k$, i.e. $v \in U$.

Let $\mathcal{V}$ be a bounded closed convex subset of $V$ which satisfies the
condition: there exist two numbers $r > 0$ and $d > 0$, which will be
chosen suitably later on, such that

$B(0, r) \subset \mathcal{V} \subset B(0, d)$

where $B(0, t)$ denotes the ball $\{v \in V \mid \|v\| < t\}$ in $V$ $(t = r, d)$. Consider the
set

$U_1 = \{v \mid v \in V;\ \exists w \in U \text{ such that } \|v - w\| \le d\}$.

Since $U$ is bounded the set $U_1$ is also bounded and $U_1 \supset U$. In the
hypotheses (HJ)$_0$ and (HJ)$_i$ we shall use only the bounded set $U_1$.

We shall use the following notation: $J_i(u_m)$, $G_i(u_m)$, $H_i(u_m)$ will be
respectively denoted by $J_i^m$, $G_i^m$, $H_i^m$ for $i = 0, 1, \dots, k$ and all $m \ge 0$.

Now suppose that starting from $u_0 \in U$ we have constructed $u_m$. We
wish to give an algorithm to obtain $u_{m+1}$. For this purpose we consider
a linear programming problem.

A linear programming problem: Let $U_m$ denote the subset of $U \times \mathbb{R}$ defined
as the set of all $(z, \sigma) \in U \times \mathbb{R}$ satisfying

(1.1) $z - u_m \in \mathcal{V}$,
$(G_0^m, z - u_m) + \sigma \le 0$, and
$J_i^m + (G_i^m, z - u_m) + \sigma \le 0$ for $i = 1, \dots, k$.

It is easy to see that $U_m$ is a nonempty closed convex bounded set.
In fact, $(z, \sigma) = (u_m, 0) \in U_m$ so that $U_m \ne \emptyset$. If $(z, \sigma) \in U_m$ then, since
$z - u_m \in \mathcal{V}$, which is a bounded set, it follows that $z$ is bounded. Then using
the other two inequalities in (1.1) it follows that $\sigma$ is also bounded. If
$(z_j, \sigma_j) \in U_m$ and $(z_j, \sigma_j) \to (z, \sigma)$ in $U \times \mathbb{R}$ then, since $U$ is closed, $z \in U$
and hence $(z, \sigma) \in U \times \mathbb{R}$. Again, since $\mathcal{V}$ is closed, $z - u_m \in \mathcal{V}$. By the
continuity of the (affine) functions

$(z, \sigma) \mapsto J_i^m + (G_i^m, z - u_m) + \sigma$,

$(z, \sigma) \mapsto (G_0^m, z - u_m) + \sigma$

we find that

$J_i^m + (G_i^m, z - u_m) + \sigma \le 0$, $(G_0^m, z - u_m) + \sigma \le 0$.

Finally, to prove the convexity, let $(z, \sigma), (z', \sigma') \in U_m$. Then, for any
$0 \le \theta \le 1$,

$(1 - \theta)z + \theta z' - u_m = (1 - \theta)(z - u_m) + \theta(z' - u_m) \in \mathcal{V}$

since $\mathcal{V}$ is convex. Moreover, we also have

$(G_0^m, (1 - \theta)z + \theta z' - u_m) + (1 - \theta)\sigma + \theta\sigma'$
$= (1 - \theta)[(G_0^m, z - u_m) + \sigma] + \theta[(G_0^m, z' - u_m) + \sigma'] \le 0$

and similarly

$J_i^m + (G_i^m, (1 - \theta)z + \theta z' - u_m) + (1 - \theta)\sigma + \theta\sigma' \le 0$.

Next we consider the functional $g : V \times \mathbb{R} \to \mathbb{R}$ given by $g(z, \sigma) = \sigma$
and the linear programming problem:

$(P_m)$: to find $(z_m, \sigma_m) \in U_m$ such that $g(z_m, \sigma_m) \ge g(z, \sigma)$, $\forall (z, \sigma) \in U_m$,

i.e.

Problem $P_m$: To find $(z_m, \sigma_m) \in U_m$ such that

(1.2) $\sigma \le \sigma_m$ for all $(z, \sigma) \in U_m$.

By the results of Chapter 2 we know that the Problem $P_m$ has a
solution (not necessarily unique).
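
To make the problem $(P_m)$ concrete, here is a minimal sketch for the special case $V = \mathbb{R}$, where the admissible $z$ form an interval and the maximum of $\sigma$ can be found by inspecting finitely many points. The function name `solve_pm_1d`, its arguments and the data in the usage lines are illustrative assumptions, not taken from the text.

```python
def solve_pm_1d(u, g0, constraints, lo, hi, d):
    """Solve (P_m) when V = R: maximise sigma over pairs (z, sigma) with
         |z - u| <= d,  lo <= z <= hi,
         g0 * (z - u) + sigma <= 0,
         Ji + gi * (z - u) + sigma <= 0   for each (Ji, gi) in constraints.
    Here g0 = G_0(u), Ji = J_i(u), gi = G_i(u). Returns one maximiser
    (z, sigma); it need not be unique."""
    rows = [(0.0, g0)] + list(constraints)      # (value, slope) pairs at u
    z_lo, z_hi = max(lo, u - d), min(hi, u + d)

    def best_sigma(z):
        # Largest sigma admissible for this z.
        return min(-val - slope * (z - u) for val, slope in rows)

    # best_sigma is concave and piecewise linear, so its maximum over the
    # interval is attained at an end point or where two of the lines cross.
    candidates = [z_lo, z_hi]
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            (v1, s1), (v2, s2) = rows[i], rows[j]
            if s1 != s2:
                z = u + (v2 - v1) / (s1 - s2)
                if z_lo <= z <= z_hi:
                    candidates.append(z)
    z_m = max(candidates, key=best_sigma)
    return z_m, best_sigma(z_m)

# Illustrative data: J0(v) = (v - 2)^2, J1(v) = v^2 - 1, so U = [-1, 1];
# at u_m = 0 we have G0 = -4, J1 = -1, G1 = 0.
z_m, sigma_m = solve_pm_1d(0.0, -4.0, [(-1.0, 0.0)], -1.0, 1.0, 3.0)
print(z_m, sigma_m)
```

In this example every $z$ in $[0.25, 1]$ gives the same maximal $\sigma$, which illustrates that the solution need not be unique. Since the maximal $\sigma$ is positive, $z_m - u_m$ points in a direction along which $J_0$ decreases.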

We are now in a position to formulate our algorithm for the
construction of $u_{m+1}$.

Algorithm. Suppose we have determined $u_m$ starting from $u_0$. Then we
take a solution $(z_m, \sigma_m)$ of the linear programming Problem $(P_m)$. We
set

(1.3) $w_m = (z_m - u_m)/\|z_m - u_m\|$

and

(1.4) $\rho_m^{\ell} = \max\{\rho \in \mathbb{R},\ u_m + \rho w_m \in U\}$.

We shall prove later on that $w_m$ is a direction of descent. We can
define the notions of convergent choices of $w_m$ and $\rho$ in the same way
as in Chapter 3, Section 1 for the functional $J_0$. We shall therefore not
repeat these definitions here.

Let $\rho_m^c$ be a convergent choice of $\rho$ for the construction of the
minimizing sequence for $J_0$ without constraints. We define

(1.5) $\rho_m = \min(\rho_m^c, \rho_m^{\ell})$

and we set

(1.6) $u_{m+1} = u_m + \rho_m w_m$.
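
The quantity $\rho_m^{\ell}$ in (1.4) is rarely available in closed form. The sketch below approximates it by bisection; this is an assumption of the sketch and not the author's procedure. It relies on $u_m \in U$ and on the convexity of $U$, so that the admissible $\rho \ge 0$ form an interval. For simplicity the set $K$ is taken to be the whole space, and the names `max_feasible_step` and `next_iterate` are hypothetical.

```python
def max_feasible_step(u, w, constraints, tol=1e-10, rho_cap=1e6):
    """Approximate rho_l = max{rho >= 0 : u + rho*w in U}, where
    U = {v : J(v) <= 0 for every J in constraints}.
    Assumes u is in U and U is convex, so the admissible rho form [0, rho_l]."""
    def feasible(rho):
        v = [ui + rho * wi for ui, wi in zip(u, w)]
        return all(J(v) <= 0.0 for J in constraints)

    hi = 1.0
    while feasible(hi):            # grow the bracket until it leaves U
        hi *= 2.0
        if hi > rho_cap:
            return rho_cap         # U looks unbounded in the direction w
    lo = 0.0
    while hi - lo > tol:           # bisection on the boundary of U
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo

def next_iterate(u, w, rho_c, constraints):
    """One update (1.5)-(1.6): rho_m = min(rho_c, rho_l), u_{m+1} = u_m + rho_m w_m."""
    rho = min(rho_c, max_feasible_step(u, w, constraints))
    return [ui + rho * wi for ui, wi in zip(u, w)]

# Illustrative data: U is the closed unit disc in R^2.
disc = [lambda v: v[0] ** 2 + v[1] ** 2 - 1.0]
print(max_feasible_step([0.0, 0.0], [1.0, 0.0], disc))
print(next_iterate([0.0, 0.0], [1.0, 0.0], 0.3, disc))
```

When the unconstrained step $\rho_m^c$ would leave $U$, the update stops on the boundary of $U$; otherwise it coincides with the unconstrained update.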
The following is the main result of this section. 

Theorem 1.1. Suppose that the convex set $K$ and the functionals $J_0, J_1,
\dots, J_k$ satisfy the hypotheses (HK) and (HJ)$_i$, $i = 0, 1, \dots, k$. Then
(1) Problem 1.1 has a unique solution and (2) $u_m \to u$ as $m \to +\infty$,
i.e. the algorithm described above to determine $u_{m+1}$ from $u_m$ is
convergent.

In particular, if $u \in U$ is the unique solution of Problem 1.1 and if $u_m$ is a
sequence given by the above algorithm then $J_0(u_m) \to J_0(u)$ as $m \to +\infty$.

For this it will be necessary to prove that $w_m$ is a direction of descent
and that $w_m$, $\rho_m$ are convergent choices.

The following two lemmas are crucial for our proof of Theorem 1.1.

Let $u \in U$ be the unique solution of Problem 1.1.

Lemma 1.1. Let the hypotheses of Theorem 1.1 be satisfied. If, for some
$m \ge 0$, we have $J_0(u) < J_0(u_m)$ then there exists an element $(y_m, \varepsilon_m) \in U_m$
such that $\varepsilon_m > 0$.

Proof. Let $u_m \in U$ be such that $J_0(u) < J_0(u_m)$. We first consider the
case where $Z \ne u$, $Z$ being the point of $K$ given in hypothesis (HK). We
introduce two real numbers $\ell_m$, $\ell'_m$ such that

$J_0(u) < \ell'_m < \ell_m < J_0(u_m)$ and $\ell'_m < J_0(Z)$.

Let $I = I(u, Z)$ denote the segment in $V$ joining $u$ and $Z$, i.e.

$I = \{w \mid w \in V;\ w = (1 - \theta)u + \theta Z,\ 0 \le \theta \le 1\}$.

Since $u$, $Z$ belong to the convex set $U$ we have $I \subset U$.
On the other hand, if $c \in \mathbb{R}$ is any constant then the set

$J_{0c} = \{v \in U;\ J_0(v) \le c\}$

is convex and closed. For, if $v, v' \in J_{0c}$ then for any $0 \le \lambda \le 1$,

$J_0((1 - \lambda)v + \lambda v') \le (1 - \lambda)J_0(v) + \lambda J_0(v') \le c$

by the convexity of $J_0$, and $(1 - \lambda)v + \lambda v' \in U$ since $U$ is convex. To see
that it is closed, let $v_j \in J_{0c}$ be a sequence such that $v_j \to v$ in $V$. Since
$U$ is closed, $v \in U$. Moreover, by the mean value property for $J_0$,

$|J_0(v_j) - J_0(v)| \le M_U\|v_j - v\| \le M_{U_1}\|v_j - v\|$

by hypothesis (HJ)$_0$(2), so that $J_0(v_j) \to J_0(v)$ as $j \to +\infty$. Hence
$J_0(v) \le c$, i.e. $v \in J_{0c}$.

Now by the choice of $\ell'_m$, $u \in I \cap J_{0\ell'_m}$ and hence $I_0 = I \cap J_{0\ell'_m} \ne \emptyset$.
It is clearly closed and bounded. $I_0$, being a closed bounded subset of the
compact set $I$, is itself compact.

Now the function $g : I_0 \to \mathbb{R}$ defined by $g = J_0|_{I_0}$ is continuous. In
fact, if $w, w' \in I_0$ then the mean value property applied to $J_0$ gives

$|g(w) - g(w')| = |J_0(w) - J_0(w')| \le M_{U_1}\|w - w'\|$

by hypothesis (HJ)$_0$(2). Moreover, by the very definition of the set
$I_0 \subset J_{0\ell'_m}$ we have

$g(w) \le \ell'_m$.

Hence $g$ attains its maximum in $I_0$, and this maximum is $\ell'_m$ because
$J_0(u) < \ell'_m < J_0(Z)$ and $J_0$ is continuous on $I$; i.e. there exists a point $y_m \in I_0$
such that $g(y_m) = J_0(y_m) = \ell'_m$, i.e. there exists a $\theta_m$, $0 \le \theta_m \le 1$, such that

$y_m = (1 - \theta_m)u + \theta_m Z$, $J_0(y_m) = \ell'_m$.

Since $J_0(u) < \ell'_m < J_0(Z)$ we see that $y_m \ne u$, $y_m \ne Z$ and therefore
$0 < \theta_m < 1$.

Next we show that $J_i(y_m) < 0$ for all $i = 1, \dots, k$. In fact, since $J_i$ is
convex and has a gradient $G_i$ we know from Proposition 3.1 of Chapter 1
that

$J_i(y_m) \ge J_i^m + (G_i^m, y_m - u_m)$

and we also have

$J_i(y_m) \le (1 - \theta_m)J_i(u) + \theta_m J_i(Z) < 0$

since $0 < \theta_m < 1$ and $J_i(Z) < 0$.

Similarly, by the convexity of $J_0$ we get

$\ell'_m = J_0(y_m) \ge J_0^m + (G_0^m, y_m - u_m) > \ell_m + (G_0^m, y_m - u_m)$,

i.e. $(G_0^m, y_m - u_m) < \ell'_m - \ell_m < 0$ by the choice of $\ell_m$, $\ell'_m$.

We can now take

$\varepsilon_m = \min\{\ell_m - \ell'_m, -J_1(y_m), \dots, -J_k(y_m)\} > 0$.

Then it follows immediately that $(y_m, \varepsilon_m) \in U_m$ and $\varepsilon_m > 0$.
We now consider the case $u = Z$. Then we can take $y_m = Z = u$ and
hence $J_i(y_m) = J_i(u) = J_i(Z) < 0$. It is enough to take

$\varepsilon_m = \min\{J_0(u_m) - J_0(u), -J_1(Z), \dots, -J_k(Z)\} > 0$.

If we now take $r > 0$ sufficiently large then $y_m - u_m \in \mathcal{V}$. This is
possible since both $y_m$ and $u_m$ are in bounded sets:

$\|y_m\| \le (1 - \theta_m)\|u\| + \theta_m\|Z\| \le \|u\| + \|Z\|$

so that

$\|y_m - u_m\| \le \|y_m\| + \|u_m\| \le \|u\| + \|Z\| + \|u_m\|$.

It is enough to take $r > \|u\| + \|Z\| + \|u_m\| > 0$. Thus $(y_m, \varepsilon_m) \in U_m$.

Corollary 1.1. Under the assumptions of Lemma 1.1 there exists a
strongly admissible direction of descent at $u_m$ for the domain $U$.

Proof. By Lemma 1.1 there exists an element $(y_m, \varepsilon_m) \in U_m$ such that
$\varepsilon_m > 0$. On the other hand, let $(z_m, \sigma_m)$ be a solution in $U_m$ of the linear
programming problem $(P_m)$. Then necessarily $\sigma_m \ge \varepsilon_m > 0$ and we can
write

(1.7) $J_i^m + (G_i^m, z_m - u_m) + \varepsilon_m \le J_i^m + (G_i^m, z_m - u_m) + \sigma_m \le 0$,

$(G_0^m, z_m - u_m) + \varepsilon_m \le (G_0^m, z_m - u_m) + \sigma_m \le 0$.

Thus we have

(1.8) $(G_0^m, z_m - u_m) \le -\varepsilon_m < 0$,

and hence

(1.9) $w_m = (z_m - u_m)/\|z_m - u_m\|$

is a direction of descent. It is strongly admissible since $U$ is convex and
we can take any sequence of numbers $\varepsilon_j > 0$, $\varepsilon_j \to 0$. □

Lemma 1.2. Let the hypotheses of Theorem 1.1 hold and, for some
$m \ge 0$, $J_0(u) < J_0(u_m)$. If $(z_m, \sigma_m) \in U_m$ is a solution of the linear
programming problem $(P_m)$ then there exists a number $\mu_m > 0$ depending
only on $\varepsilon_m$ of Lemma 1.1 such that

(1.10) $u_m + \rho(z_m - u_m) \in U$ for all $0 \le \rho \le \mu_m$.

Furthermore,

$(G_0^m, z_m - u_m) < 0$.

Proof. We have already shown the last assertion in Corollary 1.1 and
therefore we have to prove the existence of $\mu_m$ such that (1.10) holds.
For this purpose, if $\rho > 0$, we get on applying Taylor's formula to each
$J_i$ $(i = 1, \dots, k)$:

(1.11) $J_i(u_m + \rho(z_m - u_m)) = J_i^m + \rho(G_i^m, z_m - u_m) + \frac{1}{2}\rho^2(\bar{H}_i^m(z_m - u_m), z_m - u_m)$

where

$\bar{H}_i^m = H_i(u_m + \rho'(z_m - u_m))$ for some $0 < \rho' < \rho$.

Here $\|z_m - u_m\| \le d$ and hence $u_m + \rho'(z_m - u_m)$ $(0 < \rho' < \rho)$
belongs to $U_1$ if we assume $\rho \le 1$. $\|\bar{H}_i^m\|$ is bounded by $M = M_{U_1}$ and
so we get

(1.12) $J_i(u_m + \rho(z_m - u_m)) \le J_i^m + \rho(G_i^m, z_m - u_m) + \frac{1}{2}M\rho^2 d^2$.

Thus if we find a $\mu_m > 0$ such that $0 \le \rho \le \mu_m$ implies that the right
hand side of this last inequality is $\le 0$ for all $i = 1, \dots, k$ then
$u_m + \rho(z_m - u_m) \in U$.

Using the first inequality of (1.7) to replace the term $(G_i^m, z_m - u_m)$ in
(1.12) we get

(1.13) $J_i(u_m + \rho(z_m - u_m)) \le J_i^m + \rho(-J_i^m - \varepsilon_m) + \frac{1}{2}\rho^2 M d^2$.

The second degree polynomial on the right side of (1.13) vanishes
for

(1.14) $\rho = \rho_i^m = \left[(J_i^m + \varepsilon_m) + \{(J_i^m + \varepsilon_m)^2 - 2Md^2 J_i^m\}^{1/2}\right]/Md^2$.

Moreover the right side of (1.13) is smaller than

$J_i^m + \rho(-J_i^m) + \frac{1}{2}\rho^2 M d^2$

since $\varepsilon_m > 0$, $\rho > 0$, and this last expression decreases as $\rho > 0$ decreases,
as $-J_i^m = -J_i(u_m) \ge 0$. Since the polynomial in (1.13) is convex in $\rho$, is $\le 0$
at $\rho = 0$ and vanishes at $\rho_i^m > 0$, it follows that, if $0 \le \rho \le \rho_i^m$, we have

$J_i(u_m + \rho(z_m - u_m)) \le 0$.

We can now take $\mu_m = \min(\rho_1^m, \dots, \rho_k^m)$ so that we will have

$J_i(u_m + \rho(z_m - u_m)) \le 0$ for all $0 \le \rho \le \mu_m$ and $i = 1, \dots, k$.

But each of the $\rho_i^m$ given by (1.14) depends on $J_i^m$ and hence on $u_m$.
In order to get a $\mu_m > 0$ independent of $u_m$ and dependent only on $\varepsilon_m$ we
can proceed as follows. If we set

(1.15) $\varphi(y) = \left[(y + \varepsilon_m) + \{(y + \varepsilon_m)^2 - 2Md^2 y\}^{1/2}\right]/Md^2$

for $y \le 0$ then, since $y = J_i(u_m) = J_i^m \le 0$, we can write

$\rho_i^m = \varphi(J_i^m)$.

It is easily checked that the function $\varphi : ]-\infty, 0] \to \mathbb{R}$ is continuous,
$\varphi(y) > 0$ for all $y \le 0$ and $\lim_{y \to -\infty} \varphi(y) = 1$. Hence $\inf_{y \le 0} \varphi(y) = \eta(\varepsilon_m)$ exists
and $\eta(\varepsilon_m) > 0$.
We choose $\mu_m = \eta(\varepsilon_m)$. Then $0 \le \rho \le \mu_m$ implies $\rho \le \rho_i^m$ for each
$i = 1, \dots, k$ given by (1.14) and consequently, for any such $\rho > 0$,
$u_m + \rho(z_m - u_m) \in U$.

We are now in a position to prove Theorem 1.1.

Proof of Theorem 1.1. We recall that $(z_m, \sigma_m) \in U_m$ is a solution of the
linear programming problem $(P_m)$ and

$w_m = (z_m - u_m)/\|z_m - u_m\|$,

$\rho_m = \min(\rho_m^{\ell}, \rho_m^c)$.

Then $J_0(u_m)$ is a decreasing sequence. In fact, if $\rho_m = \rho_m^c$ then by the
definition of $\rho_m^c$ we have $J_0(u_{m+1}) \le J_0(u_m)$. Suppose $\rho_m = \rho_m^{\ell} < \rho_m^c$.
If $J_0(u_m + \rho_m^{\ell} w_m) \le J_0(u_m + \rho_m^c w_m)$ there is nothing to prove. So we
assume $J_0(u_m + \rho_m^{\ell} w_m) > J_0(u_m + \rho_m^c w_m)$. Consider the convex function
$\rho \mapsto J_0(u_m + \rho w_m)$ in $[0, \rho_m^c]$. It attains its minimum at
$\rho = \rho_{\min} \in ]0, \rho_m^c]$. Then $0 < \rho_m^{\ell} < \rho_{\min}$.
In fact, if $\rho_{\min} \le \rho_m^{\ell} < \rho_m^c$ then, since this function, being convex, is
increasing in $[\rho_{\min}, \rho_m^c]$, we have $J_0(u_m + \rho_m^{\ell} w_m) \le J_0(u_m + \rho_m^c w_m)$,
contradicting our assumption. Once again, since the function is convex, it is
decreasing in $[0, \rho_{\min}]$. Hence $J_0^m = J_0(u_m) \ge J_0(u_m + \rho_m^{\ell} w_m) = J_0(u_{m+1})$.
Since we know that there exists a (unique) solution $u$
of the minimizing Problem 1.1 we have $J_0(u_m) \ge J_0(u)$, $\forall m \ge 0$. Thus
$J_0(u_m)$, being a decreasing sequence bounded below, is convergent. Let
$\ell = \lim J_0(u_m)$. Clearly $\ell \ge J_0(u)$. Then there are two possible cases:

(1) $\ell = J_0(u)$ and

(2) $\ell > J_0(u)$.

Case (1). Suppose $J_0(u_m) \to \ell = J_0(u)$. Then, for any $m \ge 0$, we
have by Taylor's formula:

$J_0(u_m) = J_0(u) + (G_0(u), u_m - u) + \frac{1}{2}(\bar{H}_m(u_m - u), u_m - u)$

where

$\bar{H}_m = H_0(u + \theta(u_m - u))$ for some $0 < \theta < 1$.

Since $u, u_m \in U$ (which is convex), $u + \theta(u_m - u) \in U$ for any
$0 < \theta < 1$ and hence by hypothesis (HJ)$_0$(3)

$(\bar{H}_m(u_m - u), u_m - u) \ge \alpha\|u_m - u\|^2$, $\alpha = \alpha_{U_1} > 0$.

Moreover, since $J_0$ is convex, we have by Theorem 2.2 of Chapter 2

$(G_0(u), u_m - u) \ge 0$.

Thus we find that

$J_0(u_m) \ge J_0(u) + \frac{1}{2}\alpha\|u_m - u\|^2$,

i.e. $\|u_m - u\|^2 \le \frac{2}{\alpha}(J_0(u_m) - J_0(u))$.

Since $J_0(u_m) \to J_0(u)$ as $m \to +\infty$ it then follows that $u_m \to u$ as
$m \to +\infty$.

Case (2). We shall prove that this case cannot occur. Suppose, if
possible, that $J_0(u) < \ell \le J_0(u_m)$, $\forall m \ge 0$. We shall show that the
choices of $w_m$ and $\rho_m$ are convergent for the problem of minimization
of $J_0$ without constraints, i.e. the sequence $u_m$ constructed using our
algorithm tends to an absolute minimum of $J_0$ in $V$, which will be a
contradiction to our assumption.

$w_m$ is a convergent choice. For this we introduce, as in the proof of
Lemma 1.1, another real number $\ell'$ such that

$J_0(u) < \ell' < \ell \le J_0(u_m)$, $\forall m \ge 0$.

Then the proof of Lemma 1.1 gives the existence of $(y, \varepsilon) \in U_m$ with
$\varepsilon_m = \varepsilon > 0$, $\forall m \ge 0$. On the other hand, $(z_m, \sigma_m) \in U_m$ being a solution of
the linear programming problem $(P_m)$ we have $\sigma_m \ge \varepsilon > 0$. Hence from
(1.7) we get

(1.16) $(G_0^m, z_m - u_m) + \varepsilon \le 0$,
$J_i^m + (G_i^m, z_m - u_m) + \varepsilon \le 0$.

The first inequality here together with the Cauchy-Schwarz
inequality gives

$-\|G_0^m\| \, \|z_m - u_m\| \le (G_0^m, z_m - u_m) \le -\varepsilon$,

i.e. $\varepsilon \le \|G_0^m\| \, \|z_m - u_m\| \le M\|z_m - u_m\|$, $M = M_{U_1}$,
using hypothesis (HJ)$_0$(2). So we have

(1.17) $\|z_m - u_m\| \ge \varepsilon/M > 0$.

By Lemma 1.2 there exists a $\mu = \eta(\varepsilon) > 0$ such that

(1.10) $u_m + \rho(z_m - u_m) \in U$ if $0 \le \rho \le \eta(\varepsilon)$.

If we set $\bar{\rho} = \rho\|z_m - u_m\|$ then this is equivalent to saying
that

$u_m + \bar{\rho} w_m \in U$ if $0 \le \bar{\rho} \le \eta(\varepsilon)\|z_m - u_m\|$.

Then, in view of (1.17), $0 \le \bar{\rho} \le \varepsilon\eta(\varepsilon)/M$ implies $0 \le \bar{\rho} \le \eta(\varepsilon)\|z_m - u_m\|$
and hence

$u_m + \bar{\rho} w_m \in U$ for all $0 \le \bar{\rho} \le \varepsilon\eta(\varepsilon)/M$,

which means that

$\rho_m^{\ell} \ge \varepsilon\eta(\varepsilon)/M$.



Once again from (1.16) we have

$(G_0^m, w_m) \le -\varepsilon/\|z_m - u_m\| \le -\varepsilon/d$

because $z_m - u_m \in \mathcal{V}$ by (1.1), which means that $\|z_m - u_m\| \le d$. Since $\|G_0^m\| \le M$
we obtain

$(G_0^m/\|G_0^m\|, w_m) \le -\varepsilon/(d\|G_0^m\|) \ (\le -\varepsilon/(Md))$.

Taking $\varepsilon > 0$ small enough we conclude that

$(G_0^m/\|G_0^m\|, w_m) \le -C_1 < 0$, $1 \ge C_1 > 0$ being a constant. This
is nothing but saying that the choice of $w_m$ is convergent for the
minimization problem without constraints by $w$-Algorithm 1 of Section 1.2
of Chapter 3.

$\rho_m$ is a convergent choice. Since $\rho_m = \min(\rho_m^{\ell}, \rho_m^c)$ we consider two
possible cases.

(a) If $\rho_m = \rho_m^c$ then there is nothing to prove.

(b) Suppose $\rho_m = \rho_m^{\ell}$. We shall show that this choice of $\rho_m$ is also a
convergent choice. For this let $C_2$ be a constant such that
$0 < C_2 \le \rho_m^{\ell} = \rho_m \le \rho_m^c$.

Then $0 < \rho_m/\rho_m^c \le 1$ and we can write

$u_{m+1} = u_m + \rho_m w_m = (1 - \rho_m/\rho_m^c)u_m + (\rho_m/\rho_m^c)(u_m + \rho_m^c w_m)$.

The convexity of $J_0$ then implies that

$J_0(u_{m+1}) \le (1 - \rho_m/\rho_m^c)J_0(u_m) + (\rho_m/\rho_m^c)J_0(u_m + \rho_m^c w_m)$.

Hence we obtain

$\Delta J_0^{\rho_m} = J_0(u_m) - J_0(u_m + \rho_m w_m) = J_0(u_m) - J_0(u_{m+1})$
$\ge (\rho_m/\rho_m^c)(J_0(u_m) - J_0(u_m + \rho_m^c w_m))$,

i.e.

(1.18) $\Delta J_0^{\rho_m} \ge (\rho_m/\rho_m^c)\,\Delta J_0^{\rho_m^c}$.

We note that $\rho_m^c$ is necessarily bounded above for any $m \ge 0$. For
otherwise, since we find from the triangle inequality that

$\|u_m + \rho_m^c w_m\| \ge \rho_m^c\|w_m\| - \|u_m\| = \rho_m^c - \|u_m\|$,

$u_m + \rho_m^c w_m$ would be unbounded. Then by hypothesis (HJ)$_0$(1),
$J_0(u_m + \rho_m^c w_m)$ would also be unbounded. This is not possible by the definition
of a convergent choice of $\rho_m^c$.

Let $C_3$ be a constant such that $0 < \rho_m^c \le C_3$ for all $m \ge 0$. Then
(1.18) will give

(1.19) $\Delta J_0^{\rho_m} \ge (C_2/C_3)\,\Delta J_0^{\rho_m^c}$.

Hence if $\Delta J_0^{\rho_m} \to 0$ then $\Delta J_0^{\rho_m^c} \to 0$ by (1.19). By the definition of
$\rho_m^c$ (as a convergent choice of $\rho$) we have

$(G_0^m, w_m) \to 0$ as $m \to +\infty$

which means that $\rho_m$ is also a convergent choice of $\rho$.

Finally, since the choices of $\rho_m$, $w_m$ are both convergent for the
minimization problem without constraints for $J_0$ we conclude, using the
results of Chapter 3, that $J_0(u_m) \to J_0(\tilde{u})$ where $\tilde{u}$ is the global minimum for $J_0$
(which exists and is unique by Theorem 2.1 of Chapter 2). Thus we have

$J_0(\tilde{u}) \le J_0(u) < \ell \le J_0(u_m)$

and $J_0(u_m) \to J_0(\tilde{u})$,

which is impossible, and hence the case (2) cannot occur.
This proves the theorem completely.
We shall conclude this section with some remarks. 

Remark 1.1. A special case of our algorithm was given a long time ago by Frank and Wolfe [17] in the absence of the constraints $J_i$, which we have linearized. More precisely they considered the following problem: Let $J_0$ be a convex quadratic functional on a Hilbert space $V$ and $K$ be a closed convex subset with non-empty interior. Then the problem is to give an algorithm for finding a minimizing sequence $u_m$ for

$$u \in K, \quad J_0(u) = \inf_{v \in K} J_0(v).$$

The corresponding linear programming problem in this case will be the following:

$$U_m = K_m = \{(z, \sigma) \in K \times \mathbb{R} \;;\; (G_0^m, z - u_m) + \sigma \le 0\},$$

To find $(z_m, \sigma_m) \in K_m$ such that $\sigma_m = \max_{(z,\sigma) \in K_m} \sigma$.

Since $K$ itself can be assumed bounded using hypothesis (HJ)(1) there is no need to introduce the bounded set $\mathcal{V}$. When $z = z_m$ we have

$$(G_0^m, z_m - u_m) + \sigma \le (G_0^m, z_m - u_m) + \sigma_m \le 0 \quad \forall \sigma \in \mathbb{R}$$

i.e. $\min\,(G_0^m, z_m - u_m) + \sigma \le 0$.

The algorithm given by Frank and Wolfe was the first convex programming algorithm in the literature.

Remark 1.2. Our algorithm is a special case of a more general method, known as the method of feasible directions, found by Zoutendijk [52].

Remark 1.3. We can repeat our method to give a slightly different algorithm in the choice of $z_m$ as follows. We modify the set $U_m$ used in the linear programming problem $(P_m)$ by introducing certain parameters $\gamma_0, \gamma_1, \dots, \gamma_k$ with $\sigma$. More precisely, we replace (1.1) by

(1.1)'
$$z - u_m \in \mathcal{V},$$
$$(G_0^m, z - u_m) + \gamma_0 \sigma \le 0, \text{ and}$$
$$J_i^m + (G_i^m, z - u_m) + \gamma_i \sigma \le 0 \text{ for } i = 1, \dots, k,$$

where $\gamma_0, \gamma_1, \dots, \gamma_k$ are certain suitably chosen parameters. This modified algorithm is useful when the curvature of the set $U$ is small.

Remark 1.4. Suppose, in our problem (1.1), some constraint $J_i$ is such that $J_i(u_m) = J_i^m$ is "sufficiently negative" at some stage of the iteration (i.e. for some $m \ge 0$). Since $J_i$ is regular, $J_i(v) \le 0$ in a "sufficiently small" ball with centre at $u_m$. This can be seen explicitly using Taylor's formula. Thus we can ignore the constraint $J_i$ in the formulation of our problem, i.e. in the definition of the set $U$.

Remark 1.5. The algorithm described in this section is not often used for minimization problems arising from partial differential equations because the linear programming problem to be solved at each stage will be very large in this case. Hence our method will be expensive for numerical calculations for problems in partial differential equations.
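
The special case of Remark 1.1 can be sketched in a few lines. The following is only an illustration under assumptions of our own, not the algorithm as stated in the text: $V = \mathbb{R}^n$, $K$ is a box (so the linear programming problem is solved coordinate-wise at a vertex), and $J_0(v) = \frac12 (Av, v) - (b, v)$ with $A$ symmetric positive definite. The function name `frank_wolfe_box` and the exact step along the segment $[u_m, z_m]$ are hypothetical choices.

```python
def frank_wolfe_box(A, b, lo, hi, u, iters=200, tol=1e-12):
    """Minimise J(v) = 0.5*(Av, v) - (b, v) over the box lo <= v_i <= hi."""
    n = len(u)
    u = list(u)
    for _ in range(iters):
        # Gradient of the quadratic functional at u: G = A u - b.
        g = [sum(A[i][j] * u[j] for j in range(n)) - b[i] for i in range(n)]
        # Linearised problem: minimise (G, z - u) over the box.
        # On a box it decouples coordinate-wise and is solved at a vertex.
        z = [lo if g[i] > 0 else hi for i in range(n)]
        d = [z[i] - u[i] for i in range(n)]
        slope = sum(g[i] * d[i] for i in range(n))  # (G, z - u), always <= 0
        if slope > -tol:
            break  # no descent direction left: u is (numerically) optimal
        # Exact minimisation of J along u + rho*d, restricted to 0 <= rho <= 1.
        curv = sum(d[i] * A[i][j] * d[j] for i in range(n) for j in range(n))
        rho = min(1.0, -slope / curv) if curv > 0 else 1.0
        u = [u[i] + rho * d[i] for i in range(n)]
    return u


# Unconstrained minimum is (1, -3); on the box [-1, 1]^2 the minimum is the vertex (1, -1).
u = frank_wolfe_box([[2.0, 0.0], [0.0, 2.0]], [2.0, -6.0], -1.0, 1.0, [0.0, 0.0])
```

Restricting $\rho$ to $[0, 1]$ keeps every iterate a convex combination of points of $K$, which is what makes the method feasible without any projection.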

2 Centre Method 

In this section we shall briefly sketch another algorithm to construct minimizing sequences for the minimization problem for convex functionals on a finite dimensional space under constraints defined by a finite number of concave functionals. However we shall not prove the convergence of this algorithm. The main idea here is that at each step of the iteration we reduce the problem with constraints to a non-linear programming problem without constraints. An advantage of this method is that we do not use any regularity properties (i.e. existence of derivatives) of the functionals involved.
Let $V = \mathbb{R}^r$ and let

$$J_i : \mathbb{R}^r \to \mathbb{R}, \quad i = 1, \dots, k,$$

be continuous concave functionals (i.e. $-J_i$ are convex functionals). We define a set $U$ by

$$U = \{v \mid v \in \mathbb{R}^r,\ J_i(v) \ge 0 \text{ for all } i = 1, \dots, k\}.$$

Since $-J_i$ are convex, as in the previous section we see immediately that $U$ is a convex set.

Suppose given a functional $J_0 : \mathbb{R}^r \to \mathbb{R}$ satisfying:

(1) $J_0$ is continuous,

(2) $J_0$ is strictly convex and

(3) $J_0(v) \to +\infty$ as $\|v\| \to +\infty$.

We consider the following

Problem 2.1. To find $u \in U$ such that

$$J_0(u) \le J_0(v) \text{ for all } v \in U.$$

As usual, in view of the hypothesis (3) on $J_0$, we may without loss of generality assume that $U$ is bounded. We can describe the algorithm as follows.

Let $u_0 \in U$ be an initial point, arbitrarily fixed in $U$.

We shall find in our algorithm a sequence of triplets $(u_m, u'_m, \ell_m)$ where for each $m \ge 0$, $u_m, u'_m \in U$ and $\ell_m$ is a sequence of real numbers such that $\ell_m \ge \ell_{m+1}$ for all $m$ and $\ell_m \ge J_0(u'_m)$.

We take at the beginning of the algorithm the triple $(u_0, u'_0, \ell_0)$ where

$$u'_0 = u_0, \quad \ell_0 = J_0(u_0).$$

Suppose we have determined $(u_m, u'_m, \ell_m)$. To determine the next triplet $(u_{m+1}, u'_{m+1}, \ell_{m+1})$ we proceed in the following manner.
Consider the subset $U_m$ of $U$ given by

(2.1) $U_m = \{v \mid v \in U,\ J_0(v) \le \ell_m\}.$

Since $J_0$ is convex and continuous it follows immediately that $U_m$ is a bounded convex closed set in $\mathbb{R}^r$. Hence $U_m$ is a compact convex set in $\mathbb{R}^r$.

We define a function $\varphi_m : \mathbb{R}^r \to \mathbb{R}$ by setting

(2.2) $\varphi_m(v) = (\ell_m - J_0(v)) \prod_{i=1}^{k} J_i(v).$

The continuity of the functionals $J_0, J_1, \dots, J_k$ immediately implies that $\varphi_m$ is also a continuous function. Moreover, $\varphi_m$ has the properties of a distance from the boundary of $U_m$, i.e.

(i) $\varphi_m(v) \ge 0$ for $v \in U_m$.

(ii) $\varphi_m(v) = 0$ if $v$ belongs to the boundary of $U_m$, i.e. for any $v$ on any one of the $(k + 1)$ level surfaces defined by the equations

$$J_0(v) = \ell_m,\ J_1(v) = 0,\ \dots,\ J_k(v) = 0$$

we have

$$\varphi_m(v) = 0.$$

Now since $U_m$ is a compact convex set in $\mathbb{R}^r$ and $\varphi_m$ is continuous it attains a maximum in $U_m$. $J_0$ being strictly convex, this maximum is unique as can easily be checked.

We take $u_{m+1}$ as the solution of the maximizing problem:

Problem $2.2_m$. $u_{m+1} \in U_m$ such that $\varphi_m(u_{m+1}) \ge \varphi_m(v)$, $\forall v \in U_m$.

Now suppose $u'_m \in U_m$ so that $J_0(u'_m) \le \ell_m$. This is true by assumption at the beginning of the algorithm (i.e. when $m = 0$). Hence $\varphi_m(u'_m) \ge 0$.
We take a point $u'_{m+1}$ such that

(2.3) $u'_{m+1} \in U_m$ and $J_0(u'_{m+1}) \le J_0(u_{m+1}).$

It is clear that such a point exists since we can take $u'_{m+1} = u_{m+1}$. However we shall choose $u'_{m+1}$ as follows: Consider the line $\Lambda(u'_m, u_{m+1})$ joining $u'_m$ and $u_{m+1}$. We take for $u'_{m+1}$ the point in $U_m$ such that

(2.4) $u'_{m+1} \in \Lambda(u'_m, u_{m+1}) \cap \partial U_m$, and $J_0(u'_{m+1}) \le J_0(u_{m+1}).$

Now we have only to choose $\ell_{m+1}$. For this, let $r_m$ be a sequence of real numbers such that

(2.5) $0 < \alpha \le r_m \le 1$, where $\alpha > 0$ is a fixed constant.

We fix such a sequence arbitrarily in the beginning of the algorithm.
We define $\ell_{m+1}$ by

(2.6) $\ell_{m+1} = \ell_m - r_m(\ell_m - J_0(u'_{m+1})).$

It is clear that $\ell_{m+1} \le \ell_m$ and that $\ell_{m+1} \ge J_0(u'_{m+1})$. Thus we can state our algorithm as follows:

Algorithm. Let $u_0 \in U$ be an arbitrarily fixed initial point. We determine a sequence of triplets $(u_m, u'_m, \ell_m)$ starting from $(u_0, u_0, J_0(u_0))$ as follows: Let $(u_m, u'_m, \ell_m)$ be given. Then $(u_{m+1}, u'_{m+1}, \ell_{m+1})$ is given by

(a) $u_{m+1} \in U_m$ is the unique solution of the Problem $2.2_m$.

(b) $u'_{m+1} \in U_m$ is given by (2.4).

(c) $\ell_{m+1}$ is determined by (2.6).

Once again we can prove the convergence of this algorithm.
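
The three steps (a), (b), (c) can be tried out on a one-dimensional toy problem. The sketch below rests on assumptions that are ours and not the text's: $r = 1$ space dimension, $J_0(v) = (v - 2)^2$, a single constraint $J_1(v) = 1 - v \ge 0$, the simplest admissible choice $u'_{m+1} = u_{m+1}$ permitted by (2.3) instead of the boundary point of (2.4), a constant $r_m$, and a ternary search (valid here because $\varphi_m$ is unimodal on the interval $U_m$) standing in for a general unconstrained maximization routine. The name `centre_method_1d` is hypothetical.

```python
import math


def centre_method_1d(u0, r=1.0, iters=60):
    """Centre method for J0(v) = (v - 2)^2 under the constraint J1(v) = 1 - v >= 0."""
    J0 = lambda v: (v - 2.0) ** 2
    J1 = lambda v: 1.0 - v
    u, level = u0, J0(u0)  # u_0 and l_0 = J0(u_0)
    for _ in range(iters):
        # U_m = {v : J1(v) >= 0, J0(v) <= l_m} is the interval [a, b].
        a, b = min(2.0 - math.sqrt(level), 1.0), 1.0
        phi = lambda v: (level - J0(v)) * J1(v)  # distance function (2.2)
        # Step (a): maximise phi_m over U_m by ternary search.
        for _ in range(200):
            m1, m2 = a + (b - a) / 3.0, b - (b - a) / 3.0
            if phi(m1) < phi(m2):
                a = m1
            else:
                b = m2
        u = 0.5 * (a + b)  # u_{m+1}
        # Steps (b), (c): take u'_{m+1} = u_{m+1}, then update the level by (2.6).
        level = (1.0 - r) * level + r * J0(u)
    return u


u = centre_method_1d(0.0)  # approaches the constrained minimiser v = 1
```

Each outer pass shrinks the level set $U_m$ around the constrained minimizer, and the maximizer of $\varphi_m$ always stays in its interior.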

Remark 2.1. The maximization Problem $2.2_m$ at each step of the iteration is a non-linear programming problem without constraints. For the solution of such a problem we can use any of the algorithms described
in Chapter|21 

Remark 2.2. Since the function $\varphi_m$ which is maximized at each step has the properties of a distance function from the boundary of the domain $U_m$, that is, $\varphi_m \ge 0$ in $U_m$, $\varphi_m > 0$ in the interior $\mathring{U}_m$ and $\varphi_m = 0$ on $\partial U_m$, the maximum is attained in the interior $\mathring{U}_m$ of $U_m$. This is the reason for the nomenclature of the algorithm as the Centre method. (See also [45]).

Remark 2.3. The algorithm of the centre method was first given by Huard [25] and it was improved later on, in particular, by Trémolières

113. 

Remark 2.4. This method is once again not used for functionals $J_0$ arising from problems for partial differential equations.

3 Method of Gradient and Projection

We shall describe here a fairly simple type of algorithm for the minimization problem for a regular convex functional on a closed convex subset of a Hilbert space. In this method we suppose that it is easy to find numerically projections onto closed convex subsets. At each step, to construct the next iterate, we first use a gradient method, as developed in Chapter 3 for the minimization problem without constraints, and then we project onto the given convex set. In the "dual problem", which we shall study in Chapter 5, it is numerically easy to compute projections onto closed convex subsets and hence this method will be used there for a problem for which the convex set is defined by certain constraints which we shall call dual constraints.

Let $K$ be a closed convex subset of a Hilbert space $V$ and $J : V \to \mathbb{R}$ be a functional on $V$. We make the following hypotheses on $K$ and $J$.

(H1) $K$ is a bounded closed convex set in $V$.

(H2) $J$ is regular in $V$: $J$ is twice G-differentiable everywhere in $V$ and has a gradient $G(u)$ and hessian $H(u)$ everywhere in $V$. Moreover, there exists a constant $M > 0$ such that

$$\|H(u)\| \le M, \quad \forall u \in K.$$

(H3) $H$ is uniformly coercive on $K$: there exists a constant $\alpha > 0$ such that

$$(H(u)\varphi, \varphi) \ge \alpha \|\varphi\|^2, \quad \forall \varphi \in V \text{ and } u \in K.$$

We note that the hypothesis of boundedness in (H1) can be replaced by

(H1)' $J(v) \to +\infty$ as $\|v\| \to +\infty$.

Then we can fix a $u_0 \in K$ arbitrarily and restrict our attention to the bounded closed convex set

$$K \cap \{v \mid v \in V;\ J(v) \le J(u_0)\}.$$

The hypothesis (H3) implies that $J$ is strongly convex. The hypothesis (H2) implies that the gradient $G(u)$ is uniformly Lipschitz continuous on $K$ and we have

(3.1) $\|G(u) - G(v)\| \le M \|u - v\|, \quad \forall u, v \in K.$

We now consider the problem:

Problem 3.1. To find $u \in K$ such that $J(u) \le J(v)$, $\forall v \in K$.

Algorithm. Let $u_0 \in K$ be an arbitrarily fixed initial point of the algorithm and let $P : V \to K$ be the projection of $V$ onto the bounded closed convex set $K$.

Suppose $u_m$ is determined in the algorithm. Then we define, for $\rho > 0$,

(3.2) $u_{m+1} = P(u_m - \rho G(u_m)).$

Then we have the following

Theorem 3.1. Under the hypotheses (H1) - (H3) the Problem 3.1 has a unique solution $u$ and $u_m \to u$ as $m \to +\infty$.

This follows by a simple application of the contraction mapping theorem.

Proof. Consider the mapping of $K$ into itself defined by

(3.3) $T_\rho : K \ni u \mapsto P(u - \rho G(u)) \in K, \quad \rho > 0.$

Suppose this mapping $T_\rho$ has a fixed point $w$, i.e.

$$w \in K \text{ and satisfies } w = P(w - \rho G(w)).$$

Then we have seen that such a $w$ is characterized as a solution of the variational inequality:

(3.4) $w \in K;\ (w - (w - \rho G(w)), v - w) \ge 0, \quad \forall v \in K.$

Then (3.4) is nothing but saying that

(3.4)' $w \in K;\ (G(w), v - w) \ge 0, \quad \forall v \in K.$

Then by Theorem 2.2 of Section 2, Chapter 2, $w$ is a solution of the minimization Problem 3.1 and conversely. In other words, Problem 3.1 is equivalent to the following

Problem 3.1'. To find a fixed point of the mapping $T_\rho : K \to K$, i.e. to find $w \in K$ such that $w = P(w - \rho G(w))$.
We shall now show that this Problem 3.1' has a unique solution for $\rho > 0$ sufficiently small. For this we show that $T_\rho$ is a strict contraction for $\rho > 0$ sufficiently small: there exists a constant $\gamma$, $0 < \gamma < 1$, such that, for $\rho > 0$ small enough,

$$\|P(u - \rho G(u)) - P(v - \rho G(v))\| \le \gamma \|u - v\|, \quad \forall u, v \in K.$$

In fact, if $\rho > 0$ is any number then we have

$$\|P(u - \rho G(u)) - P(v - \rho G(v))\|^2 \le \|(u - \rho G(u)) - (v - \rho G(v))\|^2$$

since the projection $P$ does not increase distances ($\|P\| \le 1$). The right hand side here is equal to

$$\|u - v - \rho(G(u) - G(v))\|^2 = \|u - v\|^2 - 2\rho\,(G(u) - G(v), u - v) + \rho^2 \|G(u) - G(v)\|^2.$$

Here we can write by Taylor's formula

$$(G(u) - G(v), u - v) = (H(u - v), u - v)$$

where $H = H(v + \theta(u - v))$ for some $0 < \theta < 1$. Since $K$ is convex and $u, v \in K$, we have $v + \theta(u - v) \in K$ and then by uniform coercivity of $H$ on $K$ (i.e. by (H3))

$$(H(u - v), u - v) \ge \alpha \|u - v\|^2, \quad \forall u, v \in K.$$

This together with the Lipschitz continuity (3.1) of $G$ gives

$$\|P(u - \rho G(u)) - P(v - \rho G(v))\|^2 \le \|u - v\|^2 - 2\rho\alpha \|u - v\|^2 + M^2\rho^2 \|u - v\|^2 = \|u - v\|^2 (1 - 2\rho\alpha + M^2\rho^2).$$

Now if we choose $\rho$ such that

(3.5) $0 < \rho < 2\alpha / M^2$

it follows that $(1 - 2\rho\alpha + M^2\rho^2) = \gamma^2 < 1$.

Then the contraction mapping theorem applied to $T_\rho$ proves that there is a unique solution of the Problem 3.1'.

Finally, to show that $u_m \to u$ as $m \to +\infty$, we take such a $\rho > 0$ sufficiently small, i.e. $\rho > 0$ satisfying (3.5). Now if $u_{m+1}$ is defined iteratively by the algorithm (3.2) and $u$ is the unique solution of the Problem 3.1 (or equivalently of the Problem 3.1') then

$$\|u_{m+1} - u\| = \|P(u_m - \rho G(u_m)) - P(u - \rho G(u))\| \le \gamma \|u_m - u\|$$

so that we get

$$\|u_{m+1} - u\| \le \gamma^{m+1} \|u_0 - u\|.$$

Since $0 < \gamma < 1$ it follows immediately from this that $u_m \to u$ as $m \to +\infty$.

This proves the theorem completely.
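
The iteration (3.2) is short enough to sketch directly. The example below is a minimal illustration under assumptions of our own: $V = \mathbb{R}^n$, $K$ is a box so that the projection $P$ is a coordinate-wise clamp, and $J(v) = \frac12 (Av, v) - (b, v)$ so that $G(u) = Au - b$ and $H = A$. The function name `projected_gradient` and the test data are hypothetical.

```python
def projected_gradient(A, b, lo, hi, u, rho, iters=200):
    """Iterate u_{m+1} = P(u_m - rho*G(u_m)) for J(v) = 0.5*(Av, v) - (b, v) on a box."""
    n = len(u)
    u = list(u)
    for _ in range(iters):
        # Gradient of the quadratic functional: G(u) = A u - b.
        g = [sum(A[i][j] * u[j] for j in range(n)) - b[i] for i in range(n)]
        # Gradient step, then projection onto K = [lo, hi]^n (a coordinate-wise clamp).
        u = [min(hi, max(lo, u[i] - rho * g[i])) for i in range(n)]
    return u


# Hessian A = diag(2, 4): alpha = 2 and M = 4, so (3.5) asks for 0 < rho < 2*alpha/M**2 = 0.25.
u = projected_gradient([[2.0, 0.0], [0.0, 4.0]], [2.0, -12.0], -1.0, 1.0, [0.0, 0.0], rho=0.2)
```

With these data the contraction factor of the proof is $\gamma = \sqrt{1 - 2\rho\alpha + M^2\rho^2} = \sqrt{0.84}$, and the iterates approach the constrained minimizer $(1, -1)$.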

Now the convergence of the algorithm can be proved using the re- 
sults of Chapter^] (See Rosen 01, EOT ). 

We also remark that if $V = K$ and hypotheses (H1)', (H2) and (H3) are satisfied for bounded sets of $V$ then we get the gradient method of Chapter 3.

4 Minimization in Product Spaces 

In this section we shall be concerned with the problem of optimization with or without constraints by Gauss-Seidel or, more generally, by relaxation methods. The classical Gauss-Seidel method is used for solutions of linear equations in finite dimensional spaces. The main idea of optimization described here is to reduce by an iterative procedure the problem of minimizing a functional on a product space (with or without constraints) to a sequence of minimization problems in the factor spaces. Thus the methods of earlier sections can be used to obtain approximations to the solution of the problem on the product space.

The method described here follows that of the paper of Cea and 
Glowinski [9], and generalizes earlier methods due to various authors. 

We shall give algorithms for the construction of approximating sequences and prove that they converge to the solution of the optimization problem. One important feature is that we do not necessarily assume that the functionals to be minimized are G-differentiable.

4.1 Statement of the problem

The optimization problem in a product space can be formulated as follows: Let

(i) $V_i$ ($i = 1, \dots, N$) be vector spaces over $\mathbb{R}$ and let

$$V = \prod_{i=1}^{N} V_i$$

(dim $V_i$ are arbitrary).

(ii) $K$ be a convex subset of $V$ of the form $K = \prod_{i=1}^{N} K_i$ where each $K_i$ is a (non-empty) convex subset of $V_i$ ($i = 1, \dots, N$).

Suppose given a functional $J : V \to \mathbb{R}$. Consider the optimization problem:

(4.1) To find $u \in K$ such that $J(u) \le J(v)$ for all $v \in K$.

For this problem we describe two algorithms which reduce the problem to a sequence of $N$ problems at each step, each of which is a minimization problem successively in $K_i$ ($i = 1, \dots, N$). Let us denote a point $v \in V$ by its coordinates as

$$v = (v_1, \dots, v_N), \quad v_i \in V_i.$$

Algorithm 4.1. (Gauss-Seidel method with constraints).

(1) Let $u^0 = (u_1^0, \dots, u_N^0)$ be an arbitrary point in $K$.

(2) Suppose $u^n \in K$ is already determined. Then we shall determine $u^{n+1}$ in $N$ steps by successively computing its components $u_i^{n+1}$ ($i = 1, \dots, N$).

Assume $u_j^{n+1} \in K_j$ is determined for all $j < i$. Then we determine $u_i^{n+1}$ as the solution of the minimization problem:

(4.2) $u_i^{n+1} \in K_i$ such that
$$J(u_1^{n+1}, \dots, u_{i-1}^{n+1}, u_i^{n+1}, u_{i+1}^n, \dots, u_N^n) \le J(u_1^{n+1}, \dots, u_{i-1}^{n+1}, v_i, u_{i+1}^n, \dots, u_N^n) \text{ for all } v_i \in K_i.$$

In order to simplify the writing it is convenient to introduce the following notation.

Notation. Denote by $K_i^{n+1}$ ($i = 1, \dots, N$) the subset of $K$:

(4.3) $K_i^{n+1} = \{v \in K \mid v = (u_1^{n+1}, \dots, u_{i-1}^{n+1}, v_i, u_{i+1}^n, \dots, u_N^n),\ v_i \in K_i\}$

and

(4.4) $\tilde{u}_i^{n+1} = (u_1^{n+1}, \dots, u_i^{n+1}, u_{i+1}^n, \dots, u_N^n).$

With this notation we can write (4.2) as follows:

(4.2)' To find $\tilde{u}_i^{n+1} \in K_i^{n+1}$ such that $J(\tilde{u}_i^{n+1}) \le J(v)$ for all $v \in K_i^{n+1}$.



Algorithm 4.2. (Relaxation method by blocks). We introduce numbers $w_i$ with $0 < w_i < 2$ ($i = 1, 2, \dots, N$).

(1) Let $u^0 \in K$ be arbitrarily chosen.

(2) Assume $u^n \in K$ is known. Then $u^{n+1} \in K$ is determined in $N$ successive steps as follows: Suppose $u_j^{n+1} \in K_j$ is determined for all $j < i$. Then $u_i^{n+1}$ is determined in two substeps:

(4.5) To find $u_i^{n+1/2} \in V_i$ such that
$$J(u_1^{n+1}, \dots, u_{i-1}^{n+1}, u_i^{n+1/2}, u_{i+1}^n, \dots, u_N^n) \le J(u_1^{n+1}, \dots, u_{i-1}^{n+1}, v_i, u_{i+1}^n, \dots, u_N^n) \text{ for all } v_i \in V_i.$$

Then we define

(4.6) $u_i^{n+1} = P_i\bigl(u_i^n + w_i (u_i^{n+1/2} - u_i^n)\bigr)$

where

(4.7) $P_i : V_i \to K_i$ is the projection onto $K_i$ with respect to a suitable inner product which we shall specify later.

Remark 4.1. The numbers $w_i \in (0, 2)$ are called parameters of relaxation. In the classical relaxation method each $w_i = w$, a fixed number $\in (0, 2)$, and $V_i = K_i = \mathbb{R}$. Hence for the classical relaxation method

(4.8) $u_i^{n+1} = u_i^n + w (u_i^{n+1/2} - u_i^n).$

4.2 Minimization with Constraints of Convex Functionals on
Products of Reflexive Banach Spaces

Here we shall introduce all the necessary hypotheses on the functional $J$ to be minimized. We consider $J$ to consist of a differentiable part $J_0$ and a non-differentiable part $J_1$ and we make separate hypotheses on $J_0$ and $J_1$.

Let $V_i$ ($i = 1, \dots, N$) be reflexive Banach spaces and $V = \prod_{i=1}^{N} V_i$. The duality pairing $(\cdot, \cdot)_{V' \times V}$ will simply be denoted by $(\cdot, \cdot)$, the norm in $V$ by $\|\cdot\|$ and the dual norm in $V'$ by $\|\cdot\|_*$. Let $K_i$ be nonempty closed convex subsets of $V_i$ and $K = \prod_{i=1}^{N} K_i$. Then clearly $K$ is also a nonempty closed convex subset of $V$.

Let $J_0 : V \to \mathbb{R}$ be a functional satisfying the following hypotheses:

(H1) $J_0$ is G-differentiable and admits a gradient $G_0$.

(H2) $J_0$ is convex in the following sense: If, for any $M > 0$, $B_M$ denotes the ball $\{v \in V;\ \|v\| \le M\}$, then there exists a mapping

$$T_M : B_M \times B_M \to \mathbb{R}$$

such that (4.9) and (4.10) hold:

(4.9)
$$J_0(v) \ge J_0(u) + (G_0(u), v - u) + T_M(u, v),$$
$$T_M(u, v) \ge 0 \text{ for all } u, v \in B_M,$$
$$T_M(u, v) > 0 \text{ for all } u, v \in B_M \text{ with } u \ne v.$$

(4.10) If $(u_n, v_n)_n$ is a sequence in $B_M \times B_M$ such that $T_M(u_n, v_n) \to 0$ as $n \to +\infty$ then $\|u_n - v_n\| \to 0$ as $n \to +\infty$.

Remark 4.2. If $J_0$ is twice G-differentiable then we have

$$T_M(u, v) = \tfrac12 J_0''(u + \theta(v - u), v - u, v - u) \text{ for some } 0 < \theta < 1.$$

Then the hypotheses (4.9) and (4.10) can be restated in terms of $J_0''$. In particular, if $J_0$ admits a Hessian $H$ and if for every $M > 0$ there exists a constant $\alpha_M > 0$ such that

$$(H(u)\varphi, \varphi) \ge \alpha_M \|\varphi\|^2 \text{ for all } \varphi \in V \text{ and } u \in B_M$$

then the two conditions (4.9) and (4.10) are satisfied.
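
As a worked instance of (H2), assume (this is our choice of example, not part of the hypotheses) that $J_0$ is quadratic, $J_0(v) = \frac12 a(v, v) - L(v)$, with $a(\cdot, \cdot)$ bilinear, symmetric, continuous and coercive with constant $\alpha > 0$, and $L$ linear and continuous, so that $(G_0(u), \varphi) = a(u, \varphi) - L(\varphi)$. The terms in $L$ cancel and a direct expansion gives:

```latex
\begin{aligned}
J_0(v) - J_0(u) - (G_0(u), v - u)
  &= \tfrac12 a(v, v) - \tfrac12 a(u, u) - a(u, v - u) \\
  &= \tfrac12 a(v, v) - a(u, v) + \tfrac12 a(u, u) \\
  &= \tfrac12 a(v - u, v - u) \;\ge\; \tfrac{\alpha}{2}\, \|v - u\|^2 .
\end{aligned}
```

Under these assumptions one may therefore take $T_M(u, v) = \frac12 a(v - u, v - u)$, independently of $M$: it is non-negative, vanishes only for $u = v$, and tends to zero only if $\|u - v\|$ does.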

(H3) Continuity of the gradient $G_0$ of $J_0$.

If $(u_n, v_n)_n$ is a sequence in $B_M \times B_M$ such that

(4.11) $\|u_n - v_n\| \to 0$ as $n \to +\infty$ then $\|G_0(u_n) - G_0(v_n)\|_* \to 0$ as $n \to +\infty$.

Next we consider the non-differentiable part $J_1$ of $J$. Let $J_1 : V \to \mathbb{R}$ be a functional of the form

(4.12) $J_1(v) = \sum_{i=1}^{N} J_{1,i}(v_i), \quad v = (v_1, \dots, v_N) \in V$

where the functionals

$$J_{1,i} : V_i \to \mathbb{R} \quad (i = 1, \dots, N)$$

satisfy the hypothesis:

(H4) $J_{1,i}$ is a weakly lower semi-continuous convex functional on $V_i$.

We define

(4.13) $J = J_0 + J_1.$

Finally we assume that $J$ satisfies the hypothesis:

(H5) $J(v) \to +\infty$ as $\|v\| \to +\infty$.

We now consider the minimization problem:

(4.14) To find $u \in K$ such that $J(u) \le J(v)$ for all $v \in K$.

4.3 Main Results

The main theorem of this section can now be stated as:

Theorem 4.1. Under the hypotheses (H1), ..., (H5) we have the following:

(1) The problem (4.14) has a unique solution $u \in K$ and the unique solution is characterized by

(4.15) $u \in K$ such that $(G_0(u), v - u) + J_1(v) - J_1(u) \ge 0$ for all $v \in K$.

(2) The sequence $u^n$ determined by the algorithm (4.1) converges strongly to $u$ in $V$.

Proof. We shall divide the proof into several steps.

Step 1. (Proof of (1)). The first part of the theorem is an immediate consequence of the Theorems (1.1) and (2.3) of Chapter 2. In fact, $K$ is a closed non-empty convex subset of a reflexive Banach space $V$. By Hypothesis (H2), $J_0$ is strictly convex since, for any $v, u \in V$ with $v \ne u$, we have

$$J_0(v) \ge J_0(u) + (G_0(u), v - u) + T_M(u, v) > J_0(u) + (G_0(u), v - u),$$

while $J_1$ is convex, so that for any $v_1, v_2 \in V$ and $\theta \in [0, 1]$ we have

$$J(\theta v_1 + (1 - \theta) v_2) = J_0(\theta v_1 + (1 - \theta) v_2) + J_1(\theta v_1 + (1 - \theta) v_2)$$
$$\le \theta J_0(v_1) + (1 - \theta) J_0(v_2) + \theta J_1(v_1) + (1 - \theta) J_1(v_2) = \theta J(v_1) + (1 - \theta) J(v_2),$$

with strict inequality when $v_1 \ne v_2$ and $0 < \theta < 1$. Hence $J$ is strictly convex.

Next, $J$ is weakly lower semi-continuous in $V$: In fact, since $J_0$ has a gradient $G_0$ the mapping

$$\varphi \mapsto J_0'(u, \varphi) = (G_0(u), \varphi)$$

is continuous linear and hence, by Proposition 4.1 of Chapter 1, $J_0$ is weakly lower semi-continuous. On the other hand, by (H4) $J_1$ is weakly lower semi-continuous, which proves the assertion. Then Theorem (1.1) of Chapter 2 implies that the problem (4.14) has a unique solution, and Theorem (2.3) states that $u$ is characterized by (4.15).

We have therefore only to prove (2) of the statement. We shall prove the convergence of the algorithm in the following sequence of steps.

Step 2. At each stage of the algorithm the subproblem of determining $\tilde{u}_i^{n+1}$ has a solution. In fact $K_i^{n+1}$ is again a non-empty closed convex subset of $V$. Moreover, again as in Step 1, $J$ satisfies all the hypotheses of Theorem (1.1) of Chapter 2 and (2.3) of Chapter 2. Hence there exists a unique solution $\tilde{u}_i^{n+1}$ of the problem (4.14) (with $K$ replaced by $K_i^{n+1}$) and this solution is characterized by

(4.16) $\tilde{u}_i^{n+1} \in K_i^{n+1}$ such that $(G_0(\tilde{u}_i^{n+1}), v - \tilde{u}_i^{n+1}) + J_{1,i}(v_i) - J_{1,i}(u_i^{n+1}) \ge 0$ for all $v \in K_i^{n+1}$,

since

$$J_1(v) - J_1(\tilde{u}_i^{n+1}) = \sum_{j=1}^{N} \bigl(J_{1,j}(v_j) - J_{1,j}((\tilde{u}_i^{n+1})_j)\bigr) = J_{1,i}(v_i) - J_{1,i}(u_i^{n+1}).$$

Step 3. $J(u^n)$ is decreasing. We know that $\tilde{u}_{i-1}^{n+1} \in K_i^{n+1}$ for $i = 1, \dots, N$ (with the convention $\tilde{u}_0^{n+1} = u^n$) and on taking $v = \tilde{u}_{i-1}^{n+1}$ in (4.2)' we get

$$J(\tilde{u}_i^{n+1}) \le J(\tilde{u}_{i-1}^{n+1}).$$

Using this successively we find that

$$J(\tilde{u}_i^{n+1}) \le J(\tilde{u}_{i-1}^{n+1}) \le \dots \le J(\tilde{u}_0^{n+1}) = J(u^n)$$

and similarly

$$J(u^{n+1}) = J(\tilde{u}_N^{n+1}) \le \dots \le J(\tilde{u}_i^{n+1}).$$

These two together imply that

$$J(u^{n+1}) \le J(u^n) \text{ for all } n = 0, 1, 2, \dots$$

which proves that the sequence $J(u^n)$ is decreasing. In particular it is bounded above:

$$J(u^n) \le J(u^0) \text{ for all } n \ge 1.$$

Since $u \in K$ is the unique absolute minimum for $J$ given by Step 1 we have

$$J(u) \le J(u^n) \le J(u^0) \text{ for all } n \ge 1.$$

On the other hand, by Hypothesis (H5) we see that $\|u^n\|$, $\|\tilde{u}_i^{n+1}\|$ form bounded sequences. Thus there exists a constant $M > 0$ such that

(4.17) $\|u^n\| + \|\tilde{u}_i^{n+1}\| + \|u\| \le M$ for all $n \ge 1$ and all $1 \le i \le N$.

Since

$$J(u) \le J(u^{n+1}) \le J(u^n)$$

it also follows that

(4.18) $J(u^n) - J(u^{n+1}) \to 0$ as $n \to +\infty$.

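Step 4. We shall show that $u^n - u^{n+1} \to 0$ as $n \to +\infty$. For this, by the convexity hypothesis (H2) of $J_0$ applied to $u = \tilde{u}_i^{n+1}$ and $v = \tilde{u}_{i-1}^{n+1}$ we get

$$J_0(\tilde{u}_{i-1}^{n+1}) \ge J_0(\tilde{u}_i^{n+1}) + (G_0(\tilde{u}_i^{n+1}), \tilde{u}_{i-1}^{n+1} - \tilde{u}_i^{n+1}) + T_M(\tilde{u}_i^{n+1}, \tilde{u}_{i-1}^{n+1})$$

where $M > 0$ is determined by (4.17) in Step 3. From this we find

$$J(\tilde{u}_{i-1}^{n+1}) \ge J(\tilde{u}_i^{n+1}) + \bigl[(G_0(\tilde{u}_i^{n+1}), \tilde{u}_{i-1}^{n+1} - \tilde{u}_i^{n+1}) + J_1(\tilde{u}_{i-1}^{n+1}) - J_1(\tilde{u}_i^{n+1})\bigr] + T_M(\tilde{u}_i^{n+1}, \tilde{u}_{i-1}^{n+1}).$$

Here by the characterization (4.16) of $\tilde{u}_i^{n+1} \in K_i^{n+1}$ as the solution of the sub-problem we see that the term in the brackets $[\cdots] \ge 0$ and hence

$$J(\tilde{u}_{i-1}^{n+1}) \ge J(\tilde{u}_i^{n+1}) + T_M(\tilde{u}_i^{n+1}, \tilde{u}_{i-1}^{n+1}) \text{ for all } i = 1, \dots, N.$$

Adding these inequalities for $i = 1, \dots, N$ we obtain

$$J(\tilde{u}_0^{n+1}) = J(u^n) \ge J(\tilde{u}_N^{n+1}) + \sum_{i} T_M(\tilde{u}_i^{n+1}, \tilde{u}_{i-1}^{n+1}) = J(u^{n+1}) + \sum_{i} T_M(\tilde{u}_i^{n+1}, \tilde{u}_{i-1}^{n+1}),$$

that is,

$$J(u^n) - J(u^{n+1}) \ge \sum_{i} T_M(\tilde{u}_i^{n+1}, \tilde{u}_{i-1}^{n+1}).$$

Here the left side tends to 0 as $n \to \infty$ by (4.18) and each term in the sum on the right side is non-negative by (4.9) of Hypothesis (H2), so that

$$T_M(\tilde{u}_i^{n+1}, \tilde{u}_{i-1}^{n+1}) \to 0 \text{ as } n \to +\infty \text{ for all } i = 1, \dots, N.$$

In view of (4.10) of Hypothesis (H2) it follows that

(4.19) $\|\tilde{u}_i^{n+1} - \tilde{u}_{i-1}^{n+1}\| \to 0$ as $n \to +\infty$ for all $i = 1, \dots, N$, and hence $\|u^{n+1} - u^n\| \to 0$ as $n \to +\infty$,

which proves the required assertion.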

Step 5. Convergence of the algorithm. Using the convexity Hypothesis (H2) of $J_0$ with $u$ and $v$ interchanged we get

$$J_0(v) \ge J_0(u) + (G_0(u), v - u) + T_M(u, v)$$
$$J_0(u) \ge J_0(v) + (G_0(v), u - v) + T_M(v, u)$$

which on adding give

(4.20) $(G_0(v) - G_0(u), v - u) \ge R_M(v, u)$

where

$$R_M(v, u) = T_M(u, v) + T_M(v, u).$$

Taking for $u$ the unique solution of the problem (4.14) and $v = u^{n+1}$ we obtain

$$(G_0(u^{n+1}) - G_0(u), u^{n+1} - u) \ge R_M(u, u^{n+1})$$

from which we get

(4.21)
$$(G_0(u^{n+1}), u^{n+1} - u) + J_1(u^{n+1}) - J_1(u) \ge \bigl[(G_0(u), u^{n+1} - u) + J_1(u^{n+1}) - J_1(u)\bigr] + R_M(u, u^{n+1}) \ge R_M(u, u^{n+1})$$

since $u$ is characterized by (4.15). Introducing the notation

$$w_i^{n+1} = \tilde{u}_i^{n+1} + (0, \dots, 0, u_i - u_i^{n+1}, 0, \dots, 0)$$

we have

(4.22)
$$w_i^{n+1} = (u_1^{n+1}, \dots, u_{i-1}^{n+1}, u_i, u_{i+1}^n, \dots, u_N^n) \in K_i^{n+1},$$
$$\sum_{i} (w_i^{n+1} - \tilde{u}_i^{n+1}) = (u - u^{n+1}).$$

Now we use the fact that $J_1(v) = \sum_i J_{1,i}(v_i)$ to get

$$J_1(u^{n+1}) - J_1(u) = \sum_{i} \bigl(J_{1,i}(u_i^{n+1}) - J_{1,i}(u_i)\bigr),$$

which is the same as

(4.23) $J_1(u^{n+1}) - J_1(u) = \sum_{i} \bigl(J_1(\tilde{u}_i^{n+1}) - J_1(w_i^{n+1})\bigr).$

Substituting (4.22) and (4.23) in (4.21) we have

$$\sum_{i} \bigl[(G_0(u^{n+1}), \tilde{u}_i^{n+1} - w_i^{n+1}) + J_1(\tilde{u}_i^{n+1}) - J_1(w_i^{n+1})\bigr] \ge R_M(u, u^{n+1}).$$

This can be rewritten as

$$\sum_{i} (G_0(u^{n+1}) - G_0(\tilde{u}_i^{n+1}), \tilde{u}_i^{n+1} - w_i^{n+1}) \ge \sum_{i} \bigl[(G_0(\tilde{u}_i^{n+1}), w_i^{n+1} - \tilde{u}_i^{n+1}) + J_1(w_i^{n+1}) - J_1(\tilde{u}_i^{n+1})\bigr] + R_M(u, u^{n+1}).$$

But again by the characterization (4.16) of the solution $\tilde{u}_i^{n+1} \in K_i^{n+1}$ of the sub-problem, the terms in the square brackets, and hence their sum, are non-negative (to see this we take $v = w_i^{n+1} \in K_i^{n+1}$). Thus

(4.24) $\sum_{i} (G_0(u^{n+1}) - G_0(\tilde{u}_i^{n+1}), \tilde{u}_i^{n+1} - w_i^{n+1}) \ge R_M(u, u^{n+1}).$

Here we have

$$\|\tilde{u}_i^{n+1} - w_i^{n+1}\|_V = \|u_i - u_i^{n+1}\|_{V_i} \le \|u\| + \|\tilde{u}_i^{n+1}\| \le M.$$

By Cauchy-Schwarz inequality we have

$$|(G_0(u^{n+1}) - G_0(\tilde{u}_i^{n+1}), \tilde{u}_i^{n+1} - w_i^{n+1})| \le M \|G_0(u^{n+1}) - G_0(\tilde{u}_i^{n+1})\|_*.$$

Now since

$$\|u^{n+1} - \tilde{u}_i^{n+1}\| \le \sum_{j=i+1}^{N} \|\tilde{u}_j^{n+1} - \tilde{u}_{j-1}^{n+1}\|,$$

which tends to 0 by (4.19), and since $G_0$ satisfies the continuity hypothesis (4.11) of (H3), it follows that

$$R_M(u, u^{n+1}) \to 0 \text{ as } n \to \infty.$$

This by the definition of $R_M(u, v)$ implies that

$$T_M(u, u^{n+1}) \to 0 \text{ as } n \to \infty.$$

Finally, by the property (4.10) of $T_M(u, v)$ in Hypothesis (H2) we conclude that

$$\|u - u^{n+1}\| \to 0 \text{ as } n \to \infty.$$

This completes the proof of the theorem. □

Remark 4.3. If the convex set $K$ is bounded then the Hypothesis (H5) is superfluous since the existence of the constant $M > 0$ in (4.17) is then automatically assured since $u, u^n, \tilde{u}_i^{n+1} \in K$ for all $n \ge 1$ and $i = 1, \dots, N$.

4.4 Some Applications : Differentiable and Non-Differentiable
Functionals in Finite Dimensions

We shall conclude this section with a few examples as applications of our main result (Theorem 4.1) without going into the details of the proofs. To begin with we have the following:

Theorem 4.2. (Case of differentiable functionals on finite dimensional spaces).

Let $J_0 : V = \mathbb{R}^p \to \mathbb{R}$ be a functional satisfying the Hypotheses:

(K1) $J_0 \in C^1(\mathbb{R}^p, \mathbb{R})$

(K2) $J_0$ is strictly convex

(K3) $J_0(v) \to +\infty$ as $\|v\| \to +\infty$.

Then the assertions of the Theorem (4.1) hold with $J = J_0$.

It is immediate that the Hypotheses (H1) and (H3) are satisfied. Since $J_1 \equiv 0$, (H4) and (H5) are also satisfied. There remains only to prove that the Hypothesis (H2) of the convexity of $J_0$ holds. For a proof of this we refer to the paper of Cea and Glowinski [9]. (See also
Glowinski JH, IT9K 

Remark 4.4. Suppose $p = \sum_{i=1}^{N} p_i$ is a partition of $p$. Then in the above theorem we can take $V_i = \mathbb{R}^{p_i}$ so that $V = \prod_{i=1}^{N} V_i$. We also have the

Theorem 4.3. (Case of non-differentiable functionals on finite dimensional spaces - Cea and Glowinski). Let $V_i = \mathbb{R}^{p_i}$ ($i = 1, \dots, N$) and $V = \mathbb{R}^p$ ($p = \sum_{i=1}^{N} p_i$). Suppose $J_0 : V \to \mathbb{R}$ satisfies the hypotheses (K1), (K2) and (K3) of Theorem (4.2) above and $J_1 : V \to \mathbb{R}$ is another functional of the form $J_1(v) = \sum_{i=1}^{N} J_{1,i}(v_i)$ where the functionals $J_{1,i} : V_i \to \mathbb{R}$ satisfy the Hypothesis below:

(K4) $J_{1,i}$ is a non-negative, convex and continuous functional on $\mathbb{R}^{p_i} = V_i$ ($i = 1, \dots, N$).

Then the functional

$$J = J_0 + J_1$$

satisfies all the Hypotheses of Theorem (4.1) and hence the algorithm (4.1) is (strongly) convergent in $V = \mathbb{R}^p$.
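
To make the setting of Theorem 4.3 concrete, here is a small sketch of the Gauss-Seidel Algorithm 4.1 for a non-differentiable functional. All specifics are assumptions of ours: scalar blocks ($p_i = 1$), no constraints ($K_i = \mathbb{R}$), a quadratic $J_0(v) = \frac12 (Av, v) - (b, v)$ with $A$ symmetric positive definite, and $J_{1,i}(v_i) = \alpha_i |v_i|$ with $\alpha_i \ge 0$, which is non-negative, convex and continuous as (K4) requires. Each one-dimensional sub-problem then has a closed-form solution (soft thresholding). The names `soft_threshold` and `gauss_seidel_l1` are hypothetical.

```python
def soft_threshold(t, a):
    """Minimiser of 0.5*x**2 - t*x + a*|x| over the real line (a >= 0)."""
    if t > a:
        return t - a
    if t < -a:
        return t + a
    return 0.0


def gauss_seidel_l1(A, b, alpha, u, sweeps=100):
    """Gauss-Seidel sweeps for J(v) = 0.5*(Av, v) - (b, v) + sum_i alpha_i*|v_i|."""
    n = len(u)
    u = list(u)
    for _ in range(sweeps):
        for i in range(n):
            # Freeze the other components (already updated ones for j < i);
            # the i-th sub-problem is one-dimensional and is solved exactly.
            t = b[i] - sum(A[i][j] * u[j] for j in range(n) if j != i)
            u[i] = soft_threshold(t, alpha[i]) / A[i][i]
    return u


u = gauss_seidel_l1([[2.0, 1.0], [1.0, 2.0]], [3.0, 0.5], [1.0, 1.0], [0.0, 0.0])
```

No derivative of $J_1$ is ever evaluated: the non-differentiable part enters only through the exact solution of each sub-problem, which is the feature the theorem relies on.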

We shall now give a few examples of functionals $J_1$ which satisfy (K4).

Example 4.1. We take $J_{1,i}(v_i) = \alpha_i |\ell_i(v_i)|$ where

(i) $\alpha_i \ge 0$ are fixed numbers

(ii) $\ell_i : V_i = \mathbb{R}^{p_i} \to \mathbb{R}$ is a continuous linear functional for each $i = 1, \dots, N$.

In particular, if $p_i = 1$ ($i = 1, \dots, N$) and hence $p = N$ we can take

$$J_{1,i}(v_i) = \alpha_i |v_i| \quad \text{and} \quad J_1(v) = \sum_{i=1}^{N} \alpha_i |v_i|.$$

This case was treated earlier by Auslender [53] who proved that the algorithm for $u^n$ converges to the solution of the minimization problem in this case.

Example 4.2. We take

$$J_{1,i}(v_i) = \alpha_i\, [\ell_i(v_i)]^+$$

where

(i) $\alpha_i \ge 0$ are fixed numbers,

(ii) $\ell_i : V_i \to \mathbb{R}$ are continuous linear forms on $\mathbb{R}^{p_i}$, and we have used the standard notation:

$$[\ell_i(v_i)]^+ = \begin{cases} \ell_i(v_i) & \text{when } \ell_i(v_i) \ge 0 \\ 0 & \text{when } \ell_i(v_i) < 0. \end{cases}$$

Example 4.3. We take

$$J_{1,i}(v_i) = \alpha_i \|v_i\|_{\mathbb{R}^{p_i}}$$

where

$$\|v_i\|_{\mathbb{R}^{p_i}} = \Bigl(\sum_{j=1}^{p_i} |v_{i,j}|^2\Bigr)^{1/2}.$$

4.5 Minimization of Quadratic Functionals on Hilbert
Spaces - Relaxation Method by Blocks

Here we shall be concerned with the problem of minimization of quadratic functionals on convex subsets of a product of Hilbert spaces. This is one of the most used methods for problems associated with partial differential equations. We shall describe an algorithm and prove the convergence of the approximations (obtained by this algorithm) to the solution of the minimization problem under consideration.

Statement of the problem. Let $V_i$ ($i = 1, 2, \dots, N$) be Hilbert spaces, whose inner products and norms are respectively denoted by $((\cdot, \cdot))_i$ and $\|\cdot\|_i$. On the product space $V = \prod_{i=1}^{N} V_i$ we define the natural inner product and norm by

(4.25)
$$((u, v)) = \sum_{i=1}^{N} ((u_i, v_i))_i, \qquad \|v\|^2 = \sum_{i=1}^{N} \|v_i\|_i^2,$$
$$u = (u_1, \dots, u_N),\ v = (v_1, \dots, v_N) \in V,$$

for which $V$ becomes a Hilbert space. Let $K$ be a closed convex subset of $V$ of the form

(4.26) $K = \prod_{i=1}^{N} K_i$ where $K_i$ is a closed convex nonempty subset of $V_i$ ($1 \le i \le N$).

Let $J : V \to \mathbb{R}$ be a functional of the form

(4.27) $J(v) = \tfrac12 a(v, v) - L(v)$

where $a(\cdot, \cdot)$ is a bilinear, symmetric, bicontinuous, $V$-coercive form on $V$:

There exist constants $M > 0$ and $\alpha > 0$ such that

(4.28)
$$|a(u, v)| \le M \|u\|_V \|v\|_V \text{ for all } u, v \in V,$$
$$a(u, u) \ge \alpha \|u\|_V^2 \text{ for all } u \in V, \text{ and}$$
$$a(u, v) = a(v, u).$$

Moreover, $L : V \to \mathbb{R}$ is a continuous linear functional on $V$. Consider the optimization problem:

(4.29) To find $u \in K$ such that $J(u) \le J(v)$ for all $v \in K$.



Then we know by Theorem 3.1 of Chapter 2 that under the assumptions made on $V$, $K$ and $J$ the optimization problem (4.29) has a unique solution which is characterized by the variational inequality

(4.30) $u \in K$, $a(u, v - u) - L(v - u) \ge 0$ for all $v \in K$.



4.6 Algorithm (4.2) of the Relaxation Method - Details 



In order to give an algorithm for the solution of the problem (I4.29t we 
obtain the following in view of the product Hilbert space structure of V. 
First of all, we observe that the bilinear form a(-, •) give rise to bilinear 
forms 



V;XV 



j 



N 



(4.31) 
such that 

a{u,v) - ^ aij(Vi,Vj). 

In fact, for any v,eV,- if we set V to be the element of V having 
components (v')j = for j ± 1 and (v')/ - v,-, we define 

(4.33) a i7 (v ; ,v y ) = a(v ! ',v^). 

It is the clear that the properties ( 14.281 ) of a(-.-) immediately imply 
the following properties of a,y(-, •): 

citj is bicontinuous :|a i7 (v;, vj)\ < M||v,-yvj||j. 

aij{vi,Vj) = aji(vj,Vi) 

an is Vi - coercive : ^(v,-, v,-) = a(v ! , v l ) > a\\v l 
for all vi € Vu vjeVj 



(4.34) 



J " 2 - a\\vl 
Using the bicontinuity of the bilinear forms $a_{ij}(\cdot, \cdot)$ together with
the Riesz representation theorem, we can find

          $A_{ij} \in \mathcal{L}(V_i, V_j')$ such that
(4.35)    $a_{ij}(v_i, v_j) = \langle A_{ij} v_i, v_j \rangle_{V_j' \times V_j}$

where $\langle \cdot, \cdot \rangle_{V_j' \times V_j}$ denotes the duality pairing between $V_j$ and its dual
$V_j'$ (which is canonically isomorphic to $V_j$). The properties (4.34) can
equivalently be stated in the following form:

(4.34)'   $\|A_{ij}\|_{\mathcal{L}(V_i, V_j')} \le M$,
          $A_{ij} = A_{ji}^*$, $A_{ii}$ are self-adjoint,
          $\langle A_{ii} v_i, v_i \rangle_{V_i' \times V_i} \ge \alpha \|v_i\|_i^2$ for all $v_i \in V_i$.

By the Lax-Milgram lemma the $A_{ii}$ are invertible and $A_{ii}^{-1} \in \mathcal{L}(V_i', V_i)$.
In a similar way, we find that the form L defines continuous linear
functionals $L_i : V_i \to \mathbb{R}$ such that

          $L_i(v_i) = L(\bar v^i)$ for all $v_i \in V_i$,
          $L(v) = \sum_{i=1}^{N} L_i(v_i)$ for all $v \in V$.

Again by the Riesz representation theorem there exist $F_i \in V_i$ such that

          $L_i(v_i) = ((F_i, v_i))_i$ for all $v_i \in V_i$

so that we can write

(4.36)    $L(v) = \sum_{i=1}^{N} ((F_i, v_i))_i$.

As an immediate consequence of the properties of the bilinear forms
$a_{ii}(\cdot, \cdot)$ on $V_i$ we can introduce a new inner product on $V_i$ by

(4.37)    $[u_i, v_i]_i = a_{ii}(u_i, v_i)$,

which defines an equivalent norm on $V_i$, denoted by $|||\cdot|||_i$ (we
can use the Lax-Milgram lemma).
We shall denote by $P_i$ the projection of $V_i$ onto the closed convex
subset $K_i$ with respect to the inner product $[\cdot, \cdot]_i$.

We are now in a position to describe the algorithm for the relaxation
method with projection. (See also [19]).

Algorithm 4.2. - Relaxation with Projection by Blocks.

Let $w_i$ $(i = 1, \dots, N)$ be a fixed set of real numbers such that
$0 < w_i < 2$.

(1) Let $u^0 = (u_1^0, \dots, u_N^0) \in K$ be arbitrary.

(2) Suppose $u^n \in K$ is already determined. We determine $u^{n+1} \in K$ in N
successive steps as follows: Suppose $u_j^{n+1} \in K_j$ are already found
for $j < i$.

Then we take

(4.38)    $u_i^{n+1} = P_i\big(u_i^n - w_i A_{ii}^{-1}\big(\sum_{j<i} A_{ij} u_j^{n+1} + \sum_{j \ge i} A_{ij} u_j^n - F_i\big)\big)$,
          $i = 1, \dots, N$.

Remark 4.5. In applications, the boundary value problems associated
with elliptic partial differential operators will be set in appropriate
Sobolev spaces $H^m(\Omega)$ on some (bounded) open set $\Omega$ in Euclidean
space. After discretization (say, by suitable finite element approximations)
we are led to problems in finite dimensional subspaces of $H^m(\Omega)$
which increase to $H^m(\Omega)$. In such a discretization $A_{ii}$ and $A_{ij}$ will be
matrices with the properties (4.34)' described above.



4.7 Convergence of the Algorithm 

As usual we shall prove that the algorithm converges to the solution of
the minimization problem (4.29) in a sequence of steps.
We shall begin with

Step 1. $J(u^n)$ is a decreasing sequence. For this we write

(4.39)    $J(u^n) - J(u^{n+1}) = J(\tilde u_0^{n+1}) - J(\tilde u_N^{n+1}) = \sum_{i=1}^{N} \big(J(\tilde u_{i-1}^{n+1}) - J(\tilde u_i^{n+1})\big)$

and show that each term in the last sum is non-negative. We observe
here that

(4.40)    $\tilde u_{i-1}^{n+1} = (u_1^{n+1}, \dots, u_{i-1}^{n+1}, u_i^n, u_{i+1}^n, \dots, u_N^n)$,
          $\tilde u_i^{n+1} = (u_1^{n+1}, \dots, u_{i-1}^{n+1}, u_i^{n+1}, u_{i+1}^n, \dots, u_N^n)$.

Setting, for each $i = 1, \dots, N$,

(4.41)    $g_i = F_i - \sum_{j<i} A_{ij} u_j^{n+1} - \sum_{j>i} A_{ij} u_j^n$,
          $j_i(v_i) = \tfrac{1}{2}(A_{ii} v_i, v_i) - (g_i, v_i)$

we immediately see that

(4.42)    $J(\tilde u_{i-1}^{n+1}) - J(\tilde u_i^{n+1}) = j_i(u_i^n) - j_i(u_i^{n+1})$.

Hence it is enough to show that the right hand side of (4.42) is
non-negative. In fact, we shall prove the following

Proposition 4.1. For each i, $1 \le i \le N$, we have

(4.43)    $j_i(u_i^n) - j_i(u_i^{n+1}) \ge \frac{2 - w_i}{2 w_i} |||u_i^n - u_i^{n+1}|||_i^2$.

The proof will be based on some simple lemmas: 

Step 2. Two lemmas. Let H be a Hilbert space and C be a non-empty
closed convex subset of H. Consider a quadratic functional $j : H \to \mathbb{R}$
of the form

(4.44)    $j(v) = \tfrac{1}{2} b(v, v) - (g, v)$

where

(4.45)    $b(\cdot, \cdot)$ is a symmetric, bicontinuous, H-coercive
          bilinear form on H and $g \in H$.

Then we know by Theorem 3.1 of Chapter 2 that the minimization
problem

(4.46)    To find $u \in C$ such that
          $j(u) \le j(v)$ for all $v \in C$

has a unique solution. On the other hand, the hypotheses on $b(\cdot, \cdot)$ imply
that we can write

          $b(u, v) = (Bu, v)$ for all $u, v \in H$
and
          $B \in \mathcal{L}(H, H)$, $B = B^*$, $B^{-1}$ exists and belongs to $\mathcal{L}(H, H)$.

Moreover,

(4.48)    $[u, v] = b(u, v) = (Bu, v)$

defines an inner product on H such that

(4.49)    $|||u||| = [u, u]^{1/2}$

is an equivalent norm in H. Then we have the



Lemma 4.1. If $u \in C$ is the unique solution of the problem (4.46) and
if $P : H \to C$ denotes the projection onto C with respect to the inner
product $[\cdot, \cdot]$ then

(4.50)    $u = P(B^{-1} g)$.



Proof. We also know that the solution of the problem (4.46) is
characterized by the variational inequality

(4.51)    $u \in C$,
          $b(u, v - u) \ge (g, v - u)$ for all $v \in C$.

Since we can write

(4.52)    $(g, v - u) = (B B^{-1} g, v - u) = b(B^{-1} g, v - u)$

this variational inequality can be rewritten in the form

(4.51)'   $u \in C$,
          $[u - B^{-1} g, v - u] = b(u - B^{-1} g, v - u) \ge 0$ for all $v \in C$.

But it is a well known fact that this new variational inequality
characterizes the projection $P(B^{-1} g)$ with respect to the inner product $[\cdot, \cdot]$
(for a proof, see for instance Stampacchia [44]).

□

Lemma 4.2. Let $u_0 \in C$. If $u_1$ is defined by

(4.53)    $u_1 = P(u_0 + w(B^{-1} g - u_0))$, $w > 0$,

where P is the projection $H \to C$ with respect to $[\cdot, \cdot]$ then

(4.54)    $j(u_0) - j(u_1) \ge \frac{2 - w}{2w} |||u_0 - u_1|||^2$.

Proof. If $v_1, v_2 \in H$ then we have

          $j(v_1) - j(v_2) = \tfrac{1}{2}\{b(v_1, v_1) - b(v_2, v_2)\} - \{(g, v_1) - (g, v_2)\}$
                          $= \tfrac{1}{2}\{b(v_1, v_1) - b(v_2, v_2)\} - (B B^{-1} g, v_1 - v_2)$
                          $= \tfrac{1}{2}\{b(v_1, v_1) - b(v_2, v_2)\} - b(B^{-1} g, v_1 - v_2)$
                          $= \tfrac{1}{2}\{b(v_1 - B^{-1} g, v_1 - B^{-1} g) - b(v_2 - B^{-1} g, v_2 - B^{-1} g)\}$
                          $= \tfrac{1}{2}\big(|||v_1 - B^{-1} g|||^2 - |||v_2 - B^{-1} g|||^2\big)$.

Since we can write

          $u_1 - B^{-1} g = (u_0 - B^{-1} g) + (u_1 - u_0)$

we find

(4.55)    $|||u_0 - B^{-1} g|||^2 = |||u_1 - B^{-1} g|||^2 - |||u_1 - u_0|||^2 + 2[u_0 - B^{-1} g, u_0 - u_1]$.

But on the other hand, by definition of $u_1$ as the projection it follows
that

          $[u_0 + w(B^{-1} g - u_0) - u_1, u_0 - u_1] \le 0$

and hence

          $|||u_0 - u_1|||^2 \le w[u_0 - B^{-1} g, u_0 - u_1]$.

Substituting this in the above identity (4.55) we get (the last step uses $2 - w \ge 0$)

          $|||u_0 - B^{-1} g|||^2 - |||u_1 - B^{-1} g|||^2 \ge (2 - w)[u_0 - B^{-1} g, u_0 - u_1]$
                                                    $\ge \frac{2 - w}{w} |||u_0 - u_1|||^2$,

which, by the identity for $j(v_1) - j(v_2)$ with $v_1 = u_0$, $v_2 = u_1$, is precisely the required estimate (4.54).
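
The estimate (4.54) can be checked numerically in the simplest setting. The following is a minimal sketch, assuming H = R, b(u, v) = b·u·v with a constant b > 0, and C a closed interval [lo, hi]; the function names and the interval are illustrative choices, not part of the text. In this setting the projection with respect to $[\cdot, \cdot]$ is ordinary clamping, and $|||u|||^2 = b u^2$.

```python
def clamp(x, lo, hi):
    """Projection of the real number x onto the interval [lo, hi]."""
    return max(lo, min(hi, x))


def j(v, b, g):
    """Quadratic functional j(v) = 1/2 * b(v, v) - (g, v) on H = R."""
    return 0.5 * b * v * v - g * v


def relaxation_step(u0, w, b, g, lo, hi):
    """One step u1 = P(u0 + w * (B^{-1} g - u0)) as in (4.53)."""
    return clamp(u0 + w * (g / b - u0), lo, hi)


def estimate_gap(u0, w, b, g, lo, hi):
    """Left side minus right side of (4.54); expected to be >= 0 for 0 < w <= 2."""
    u1 = relaxation_step(u0, w, b, g, lo, hi)
    decrease = j(u0, b, g) - j(u1, b, g)
    bound = (2.0 - w) / (2.0 * w) * b * (u0 - u1) ** 2
    return decrease - bound
```

When the projection is inactive the two sides of (4.54) coincide, so the gap is zero up to rounding; the check therefore needs a small tolerance.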

Step 3. Proof of Proposition 4.1. It is enough to take

          $H = V_i$, $C = K_i$, $b(\cdot, \cdot) = a_{ii}(\cdot, \cdot)$, $P = P_i = \mathrm{Proj}\{V_i \to K_i\}$

and

          $u_0 = u_i^n$, $u_1 = u_i^{n+1}$, $w = w_i$

in Lemma 4.2.

Corollary 4.1. We have, for each $n \ge 0$,

(4.56)    $J(u^n) - J(u^{n+1}) \ge \sum_{i=1}^{N} \frac{2 - w_i}{2 w_i} |||u_i^n - u_i^{n+1}|||_i^2$.

Proposition 4.2. If $0 < w_i < 2$ for all $i = 1, \dots, N$ then

(4.57)    $J(u^n) \ge J(u^{n+1})$ for all n and
          $u^n - u^{n+1} \to 0$ strongly in V as $n \to \infty$.
Proof. The fact that $J(u^n)$ is a decreasing sequence follows immediately
from Corollary 4.1. Moreover, $J(u^n) \ge J(u)$ for all n, where u is
the (unique) absolute minimum of J in K. Hence,

          $J(u^n) - J(u^{n+1}) \to 0$ as $n \to \infty$.

Once again using Corollary 4.1 and the fact that $2 - w_i > 0$ for
each i it follows that

          $|||u_i^{n+1} - u_i^n|||_i \to 0$ as $n \to \infty$.

Since $|||\cdot|||_i$ and $\|\cdot\|_i$ are equivalent norms on $V_i$ we find that

          $\|u_i^n - u_i^{n+1}\|_i \to 0$ as $n \to \infty$

and therefore

          $\|u^n - u^{n+1}\| = \big(\sum_{i=1}^{N} \|u_i^n - u_i^{n+1}\|_i^2\big)^{1/2} \to 0$

which proves the assertion.

□

Step 4. Convergence of $u^n$. We have the following result.

Theorem 4.4. If $0 < w_i < 2$ for all $i = 1, \dots, N$ and if $u^n$ is the sequence
defined by Algorithm 4.2 then

(4.58)    $u^n \to u$ strongly in V.

Proof. By V-coercivity of the bilinear form $a(\cdot, \cdot)$ we have

          $\alpha \|u^{n+1} - u\|^2 \le a(u^{n+1} - u, u^{n+1} - u)$
                                   $= \{a(u^{n+1}, u^{n+1} - u) - (f, u^{n+1} - u)\}$
                                   $\quad - \{a(u, u^{n+1} - u) - (f, u^{n+1} - u)\}$.

Here $u^{n+1} \in K$ and u is characterized by the variational inequality
(4.30) so that

          $a(u, u^{n+1} - u) - (f, u^{n+1} - u) \ge 0$

and we obtain

(4.59)    $\alpha \|u^{n+1} - u\|^2 \le a(u^{n+1}, u^{n+1} - u) - (f, u^{n+1} - u)$.

We can also write (4.59) in terms of the operators $A_{ij}$ as

(4.59)'   $\alpha \|u^{n+1} - u\|^2 \le \sum_i \big(\big(\sum_j A_{ij} u_j^{n+1} - f_i, u_i^{n+1} - u_i\big)\big)_i$.

Consider the minimization problem

(4.60)    $\bar u_i^{n+1} \in K_i$ such that
          $j_i(\bar u_i^{n+1}) \le j_i(v_i)$ for all $v_i \in K_i$ where
          $j_i(v_i) = J(u_1^{n+1}, \dots, u_{i-1}^{n+1}, v_i, u_{i+1}^n, \dots, u_N^n)$.

We notice that the definition of the functional $v_i \mapsto j_i(v_i)$ coincides
with the definition (4.41). The unique solution of the problem (4.60)
(which exists by Theorem 3.1 of Chapter 2) is characterized (in view of
Lemma 4.1) by

(4.61)    $\bar u_i^{n+1} = P_i(A_{ii}^{-1} g_i) = P_i A_{ii}^{-1}\big(f_i - \sum_{j<i} A_{ij} u_j^{n+1} - \sum_{j>i} A_{ij} u_j^n\big)$

or equivalently by the variational inequality:

          $((A_{ii} \bar u_i^{n+1} - g_i, v_i - \bar u_i^{n+1}))_i \ge 0$ for all $v_i \in K_i$.

That is, we have

(4.62)    $\bar u_i^{n+1} \in K_i$,
          $\big(\big(A_{ii} \bar u_i^{n+1} + \sum_{j<i} A_{ij} u_j^{n+1} + \sum_{j>i} A_{ij} u_j^n - f_i, v_i - \bar u_i^{n+1}\big)\big)_i \ge 0$ for all $v_i \in K_i$.
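We can now write the right hand side of (4.59)' as a sum

(4.59)''  $I_1 + I_2 + I_3 + I_4$

where

(4.63)    $I_1 = \sum_i ((A_{ii}(u_i^{n+1} - \bar u_i^{n+1}), u_i^{n+1} - u_i))_i$,
          $I_2 = \sum_i \big(\big(\sum_{j>i} A_{ij}(u_j^{n+1} - u_j^n), u_i^{n+1} - u_i\big)\big)_i$,
          $I_3 = \sum_i \big(\big(\sum_{j<i} A_{ij} u_j^{n+1} + A_{ii} \bar u_i^{n+1} + \sum_{j>i} A_{ij} u_j^n - f_i, u_i^{n+1} - \bar u_i^{n+1}\big)\big)_i$,
          $I_4 = \sum_i \big(\big(\sum_{j<i} A_{ij} u_j^{n+1} + A_{ii} \bar u_i^{n+1} + \sum_{j>i} A_{ij} u_j^n - f_i, \bar u_i^{n+1} - u_i\big)\big)_i$.

First of all, by (4.62) with $v_i = u_i$, $I_4 \le 0$ and hence

(4.64)    $\alpha \|u^{n+1} - u\|^2 \le I_1 + I_2 + I_3$.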

We shall estimate each one of $I_1, I_2, I_3$ as follows: Since $A_{ij} \in \mathcal{L}(V_i, V_j)$
we set

(4.65)    $M_1 = \max_{1 \le i, j \le N} \|A_{ij}\|_{\mathcal{L}(V_i, V_j)}$.

We also know that $\|u^n\|$, $\|\bar u^n\|$ and hence $\|u_i^n\|$, $\|\bar u_i^n\|$ are bounded
sequences. For otherwise, $j_i(u_i^n)$ and $j_i(\bar u_i^n)$ would tend to $+\infty$ as $n \to \infty$.
But we know that they are bounded above by $J(u^0)$. So let

(4.66)    $M_2 = \max_{1 \le i \le N} \big(\sup_n \|u_i^n\|, \sup_n \|\bar u_i^n\|\big)$.

Then, by the Cauchy-Schwarz inequality, we get

          $|I_1| \le \big(\sum_i \|u_i^{n+1} - u_i\|_i^2\big)^{1/2} \big(\sum_i \|A_{ii}\|^2 \|u_i^{n+1} - \bar u_i^{n+1}\|_i^2\big)^{1/2}$
                $\le M_1 (M_2 + \|u\|) \|u^{n+1} - \bar u^{n+1}\|$

and similarly we have

          $|I_2| \le M_1 (M_2 + \|u\|) \|u^{n+1} - u^n\|$,
          $|I_3| \le M_1 (M_2 + \|f\|) \|u^{n+1} - \bar u^{n+1}\|$.

These estimates together with (4.64) give

(4.67)    $\alpha \|u^{n+1} - u\|^2 \le 3 M_1 (M_2 + \|u\| + \|f\|) \big(\|u^{n+1} - \bar u^{n+1}\| + \|u^{n+1} - u^n\|\big)$.

Since $\|u^{n+1} - u^n\| \to 0$ by (4.57), it is enough to prove that

(4.68)    $\|u^{n+1} - \bar u^{n+1}\| \to 0$ as $n \to \infty$.

For this purpose, since $w_i > 0$ we can multiply the variational
inequality (4.62) by $w_i$ and then we can rewrite it as

(4.62)'   $\big(\big(A_{ii} \bar u_i^{n+1} - (1 - w_i) A_{ii} \bar u_i^{n+1} + w_i\big(\sum_{j<i} A_{ij} u_j^{n+1} + \sum_{j>i} A_{ij} u_j^n - f_i\big), v_i - \bar u_i^{n+1}\big)\big)_i \ge 0$.

Once again using the fact that this variational inequality characterizes
the projection $P_i : V_i \to K_i$ we see that

(4.69)    $\bar u_i^{n+1} = P_i\big\{(1 - w_i) \bar u_i^{n+1} - w_i A_{ii}^{-1}\big(\sum_{j<i} A_{ij} u_j^{n+1} + \sum_{j>i} A_{ij} u_j^n - f_i\big)\big\}$.

By (4.38) we also have

          $u_i^{n+1} = P_i\big\{(1 - w_i) u_i^n - w_i A_{ii}^{-1}\big(\sum_{j<i} A_{ij} u_j^{n+1} + \sum_{j>i} A_{ij} u_j^n - f_i\big)\big\}$.

Subtracting one from the other and using the fact that projections
are contractions we obtain

(4.70)    $|||\bar u_i^{n+1} - u_i^{n+1}|||_i \le |1 - w_i| \, |||\bar u_i^{n+1} - u_i^n|||_i \le |||\bar u_i^{n+1} - u_i^n|||_i$

since $0 < w_i < 2$ if and only if $0 \le |1 - w_i| < 1$. Now by the triangle
inequality we have

          $|||u_i^n - u_i^{n+1}|||_i \ge |||u_i^n - \bar u_i^{n+1}|||_i - |||\bar u_i^{n+1} - u_i^{n+1}|||_i$
                                   $\ge (1 - |1 - w_i|) \, |||\bar u_i^{n+1} - u_i^n|||_i$
                                   $\ge (1 - |1 - w_i|) \, |||\bar u_i^{n+1} - u_i^{n+1}|||_i$.

But here, by (4.57), we know that

          $|||u_i^n - u_i^{n+1}|||_i \to 0$ as $n \to \infty$

and since $1 - |1 - w_i| > 0$ it follows that

          $|||\bar u_i^{n+1} - u_i^{n+1}|||_i \to 0$

which is the required assertion.

Remark 4.6. Theorem 4.4 above on convergence of the relaxation
method generalizes a result of Cryer [10] and a classical result of
Varga [50] in the finite dimensional case but without constraints.

Remark 4.7. In this section we have introduced the parameters $w_i$ of
relaxation. The algorithm described is said to be of over relaxation type
(resp. relaxation, or under relaxation) with projection when $w_i > 1$
(resp. $w_i = 1$ or $0 < w_i < 1$) for all $i = 1, \dots, N$.



4.8 Some Examples - Relaxation Method in Finite 
Dimensional Spaces 

Let $V_i = \mathbb{R}$ $(i = 1, \dots, N)$ and $V = \prod_{i=1}^{N} V_i = \mathbb{R}^N$. Let A be a symmetric,
positive definite $(N \times N)$-matrix such that there is a constant $\alpha > 0$ with

(4.71)    $(Av, v)_{\mathbb{R}^N} \ge \alpha \|v\|_{\mathbb{R}^N}^2$ for all $v \in \mathbb{R}^N$.

Consider the quadratic functional $J : \mathbb{R}^N \to \mathbb{R}$ of the form

(4.72)    $J(v) = \tfrac{1}{2}(Av, v)_{\mathbb{R}^N} - (f, v)_{\mathbb{R}^N}$, $f \in \mathbb{R}^N$.

We consider the optimization problem for J.

Example 4.4. (Optimization without constraints).

(4.73)    To find $u \in \mathbb{R}^N$ such that
          $J(u) \le J(v)$ for all $v \in \mathbb{R}^N$.
If we write the matrix A as $A = (a_{ij})$ then

(4.74)    $J(v) = \tfrac{1}{2} \sum_{i,j=1}^{N} a_{ij} v_j v_i - \sum_{i=1}^{N} f_i v_i$, $v = (v_1, \dots, v_N) \in \mathbb{R}^N$.

We find then that the components of grad J are

          $(\mathrm{grad}\, J(v))_i = (Av - f)_i = \sum_{j=1}^{N} a_{ij} v_j - f_i$, $i = 1, \dots, N$.

If $u \in \mathbb{R}^N$ is the (unique) solution of (4.73) then grad $J(u) = 0$. That
is,

          $u = (u_1, \dots, u_N)$,
          $\sum_{j=1}^{N} a_{ij} u_j = f_i$, $i = 1, \dots, N$.

To describe the algorithm (if we take $w_i = 1$ for all $i = 1, \dots, N$):
to construct $u^{k+1}$ from $u^k$ we find $u_i^{k+1}$ as the solution of the equation

          $\sum_{j<i} a_{ij} u_j^{k+1} + a_{ii} u_i^{k+1} + \sum_{j>i} a_{ij} u_j^k = f_i$.

Since $a_{ii} \ge \alpha > 0$ we have

(4.75)    $u_i^{k+1} = a_{ii}^{-1}\big[f_i - \sum_{j<i} a_{ij} u_j^{k+1} - \sum_{j>i} a_{ij} u_j^k\big]$,

and thus we obtain the algorithm of the classical Gauss-Seidel method
in finite dimensional spaces.

More generally, introducing a parameter w $(0 < w < 2)$ of relaxation
we obtain the following algorithm:

(4.76)    $u_i^{k+1/2} = a_{ii}^{-1}\big[f_i - \sum_{j<i} a_{ij} u_j^{k+1} - \sum_{j>i} a_{ij} u_j^k\big]$,
          $u_i^{k+1} = u_i^k + w(u_i^{k+1/2} - u_i^k)$.
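
The two-stage update (4.76) translates directly into a componentwise sweep. The following is a minimal sketch in plain Python, assuming A is given as a list of rows of a symmetric positive definite matrix; the names `sor_sweep` and `sor_solve`, the zero starting vector and the fixed number of sweeps are illustrative assumptions. With w = 1 it reduces to the Gauss-Seidel iteration (4.75).

```python
def sor_sweep(A, f, u, w):
    """One sweep of (4.76): components j < i are already updated, j > i are old."""
    n = len(f)
    u = list(u)  # work on a copy so the caller's vector is unchanged
    for i in range(n):
        s = sum(A[i][j] * u[j] for j in range(n) if j != i)
        u_half = (f[i] - s) / A[i][i]      # u_i^{k+1/2}
        u[i] = u[i] + w * (u_half - u[i])  # u_i^{k+1}
    return u


def sor_solve(A, f, w=1.0, sweeps=200):
    """Iterate the sweep from u^0 = 0 for a fixed number of sweeps."""
    u = [0.0] * len(f)
    for _ in range(sweeps):
        u = sor_sweep(A, f, u, w)
    return u
```

For example, `sor_solve([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0], w=1.2)` approximates the solution of Au = f for that matrix.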



Example 4.5. (Optimization with constraints in finite dimensional
spaces).

Let $V_i$, V and J be as in Example 4.4. We take for the convex set K
the following set: Let $I_0, I_1$ be a partition of the set $\{1, 2, \dots, N\}$. That
is

          $I_0 \cap I_1 = \emptyset$ and $\{1, 2, \dots, N\} = I_0 \cup I_1$.

Define

(4.77)    $K_i = \{v_i \in \mathbb{R}; v_i \ge 0\}$ for all $i \in I_0$ and
          $K_i = \mathbb{R}$ for all $i \in I_1$

and hence

(4.78)    $K = \prod_{i=1}^{N} K_i = \{v \in \mathbb{R}^N; v = (v_1, \dots, v_N)$ such that $v_i \ge 0$ for $i \in I_0\}$.

As in the previous case, suppose $u^k$ is known. Assume that $u_j^{k+1}$
are found for all $j < i$. We find $u_i^{k+1}$ in three substeps as follows: We
define $u_i^{k+1/3}$ as the unique solution of the linear equation obtained by
requiring the gradient to vanish at the minimum: more precisely,

(4.79)    $u_i^{k+1/3} = a_{ii}^{-1}\big[f_i - \sum_{j<i} a_{ij} u_j^{k+1} - \sum_{j>i} a_{ij} u_j^k\big]$.

Then we set

          $u_i^{k+2/3} = u_i^k + w(u_i^{k+1/3} - u_i^k)$,
(4.80)    $u_i^{k+1} = P_i(u_i^{k+2/3})$

where $P_i$ is the projection of $V_i$ onto $K_i$ with respect to the inner product

          $[u_i, v_i] = a_{ii}(u_i, v_i) = a_{ii} u_i v_i$.

Since $a_{ii} > 0$ and the $K_i$ are defined by (4.77), $P_i$ coincides with the
projection of $V_i$ onto $K_i$ with respect to the standard inner product on $\mathbb{R}$.
Hence we have

(4.81)    $u_i^{k+1} = 0$ if $u_i^{k+2/3} \le 0$ and $i \in I_0$,
          $u_i^{k+1} = u_i^{k+2/3}$ in all other cases.
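
The three substeps (4.79) to (4.81) can be sketched as follows. This is an illustrative implementation, assuming A is a list of rows of a symmetric positive definite matrix and that `nonneg` (a hypothetical parameter name) is the index set $I_0$, using 0-based indices. The function also records J after every sweep, so the monotone decrease asserted in Corollary 4.1 can be observed.

```python
def quadratic_J(A, f, v):
    """J(v) = 1/2 (Av, v) - (f, v) as in (4.72)."""
    n = len(v)
    quad = sum(A[i][j] * v[i] * v[j] for i in range(n) for j in range(n))
    return 0.5 * quad - sum(f[i] * v[i] for i in range(n))


def projected_sor(A, f, nonneg, w=1.0, sweeps=200):
    """Relaxation with projection onto K = {v : v_i >= 0 for i in nonneg}."""
    n = len(f)
    u = [0.0] * n                      # u^0 = 0 belongs to K
    values = [quadratic_J(A, f, u)]    # J(u^k) after each sweep
    for _ in range(sweeps):
        for i in range(n):
            s = sum(A[i][j] * u[j] for j in range(n) if j != i)
            u_13 = (f[i] - s) / A[i][i]                 # (4.79)
            u_23 = u[i] + w * (u_13 - u[i])             # relaxation substep
            u[i] = max(0.0, u_23) if i in nonneg else u_23  # (4.80)-(4.81)
        values.append(quadratic_J(A, f, u))
    return u, values
```

For A = [[2, 1], [1, 2]], f = (-1, 1) and both components constrained, the unconstrained minimizer (-1, 1) is infeasible and the iteration settles at (0, 1/2) instead.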
Example 4.6. Let $V = \mathbb{R}^N = \mathbb{R}^1 \times \mathbb{R}^{N-1}$, $K = K_1 \times K_2$ with $K_1 = \mathbb{R}^1$ and

          $K_2 = \{v \in \mathbb{R}^{N-1}; g(v) \le 0\}$,

where $g : \mathbb{R}^{N-1} \to \mathbb{R}$ is a given smooth functional on $\mathbb{R}^{N-1}$. Let
$J : V \to \mathbb{R}$ be a functional of the form (4.74). We can use again an
algorithm of the above type. In order to give an algorithm for the
construction of the projection $P_2$ of $V_2 = \mathbb{R}^{N-1}$ onto $K_2$ we can use any one
of the standard methods described in the earlier sections as, for instance, the
method of descent.

4.9 Example in Infinite Dimensional Hilbert Spaces - Opti- 
mization with Constraints in Sobolev Spaces 

We shall only mention briefly a few examples, without going into any
details, of optimization problems in the typical spaces of infinite
dimension which are of interest to linear partial differential equations, namely
the Sobolev spaces $H^1(\Omega)$, $H_0^1(\Omega)$ which occur naturally in various
variational elliptic problems of second order.

Example 4.7. Let $\Omega$ be a bounded open set in $\mathbb{R}^n$ with smooth boundary
$\Gamma$.

Consider the closed convex subset K in $H^1(\Omega)$ given by

(4.82)    $K = \{v; v \in H^1(\Omega), \gamma_0 v \ge 0$ a.e. on $\Gamma\}$,

and the quadratic functional $J_0 : H^1(\Omega) \to \mathbb{R}$ defined by

(4.83)    $J_0(v) = \tfrac{1}{2} \|v\|_{H^1(\Omega)}^2 - (f, v)_{L^2(\Omega)}$.

Then we have the optimization problem

(4.84)    To find $u \in K$ such that
          $J_0(u) \le J_0(v)$ for all $v \in K$.

Usually we use the method of over relaxation for this problem.
Example 4.8. Let $\Omega$ be a simply connected bounded open set in the
plane $\mathbb{R}^2$.
Consider

(4.85)    $K_1 = \{v \in H_0^1(\Omega); |\mathrm{grad}\, v(x)| \le 1$ a.e. in $\Omega\}$ and

(4.86)    $J(v) = \tfrac{1}{2} \int_\Omega |\mathrm{grad}\, v|^2 \, dx - C \int_\Omega v \, dx$

where C is a constant > 0.

The existence and uniqueness of the solution to the minimization
problem:

(4.87)    To find $u \in K_1$ such that
          $J(u) \le J(v)$ for all $v \in K_1$

is classical and its properties have been studied in the paper of Brezis
and Stampacchia [4] and some others. It was also shown by Brezis and
Sibony [2] that the solution of (4.87) is also the solution of the problem

(4.88)    To find $u \in K_2$ such that
          $J(u) \le J(v)$ for all $v \in K_2$, where
          $K_2 = \{v \in H_0^1(\Omega); |v(x)| \le d(x, \Gamma)$ a.e. in $\Omega\}$,

$d(x, \Gamma)$ being the distance of $x \in \Omega$ to the boundary $\Gamma$ of $\Omega$.

The method of relaxation described earlier has been used to solve the
problem (4.88) numerically by Cea and Glowinski [8, 9]. We also
remark that the problem (4.87) is a problem of elasto-plasticity where $\Omega$
denotes the cross section of a cylindrical bar whose boundary is $\Gamma$ and
which is made of an elastic material which is perfectly plastic. For
details of the numerical analysis of this problem we refer the reader to the
paper of Cea and Glowinski quoted above.



Chapter 5 

Duality and Its Applications 



We shall introduce in this chapter another method to solve the problem
of minimization with constraints of functionals $J_0$ on a Hilbert space V.
This method in turn permits us to construct new algorithms for finding
minimizing sequences to the solution of our problem. In this chapter we
shall refer to the minimization problem:

(P)       To find $u \in U$, $J_0(u) = \inf_{v \in U} J_0(v)$

where the constraints are imposed by the set U as the "Primal problem".
In the previous chapter U was defined by means of a finite number of
functionals $J_1, \dots, J_k$ on V:

          $U = \{v \mid v \in V; J_i(v) \le 0, i = 1, \dots, k\}$.

The main idea of the method used in this chapter can be described
as follows: We shall describe the condition that an element v belongs to
the constraint set U by means of an inequality condition for a suitable
functional of two arguments. For this purpose, we introduce a cone
$\Lambda$ in a suitable topological vector space and a functional $\varphi$ on $V \times \Lambda$
in such a way that $\varphi(v, \mu) \le 0$ for all $\mu \in \Lambda$ is equivalent to the fact that v belongs
to U. Of course, the choices of $\Lambda$ and $\varphi$ are not unique. Then the
primal problem (P) will be transformed to a mini-max problem for the
functional $\mathcal{L}(v, \mu) = J(v) + \varphi(v, \mu)$ on $V \times \Lambda$. The new functional $\mathcal{L}$ is
called a Lagrangian associated to the problem (P).

We shall show that the primal problem is equivalent to the minimax
problem for the Lagrangian (which is a functional in two arguments
on $V \times \Lambda$). The interest of this method is that under suitable hypotheses,
if $(u, \lambda)$ is a minimax point for the Lagrangian then u will be a solution
of the primal problem while $\lambda$ will be a solution of the so called "dual
max-mini problem" which is defined in a natural way by the Lagrangian
in this method. Thus under certain hypotheses a minimax point
characterizes a solution of the primal problem.

Results on the existence of minimax points are known in the
literature. We shall show that when V is of finite dimension, under certain
assumptions, the existence of a minimax point follows from the classical
Hahn-Banach theorem. In the infinite dimensional case we shall
illustrate our method, which makes use of a result of Ky Fan [29] and Sion
[41], [42]. However our arguments are very general and extend easily
to the general problem.

1 Preliminaries 

We shall begin by recalling the above mentioned two results in the form 
we shall use in this chapter. 

Theorem 1.1. (Hahn-Banach). Let V be a topological vector space.
Suppose M and N are two convex sets in V such that M has at least
one interior point and N does not contain any interior point of M (i.e.
Int $M \ne \emptyset$, $N \cap$ Int $M = \emptyset$). Then there exist an $F \in V'$, $F \ne 0$ and an $\alpha \in \mathbb{R}$
such that

(1.1)     $\langle F, m \rangle_{V' \times V} = F(m) \le \alpha \le F(n)$, $\forall m \in M$, $\forall n \in N$.

In order to state the next result it is necessary to introduce the notion
of minimax point, sometimes also called saddle point.
Let V and E be two sets and

          $\mathcal{L} : V \times E \to \mathbb{R}$

be a functional on $V \times E$.

Definition. A point $(u, \lambda) \in V \times E$ is said to be a minimax point or saddle
point of $\mathcal{L}$ if

(1.2)     $\mathcal{L}(u, \mu) \le \mathcal{L}(u, \lambda) \le \mathcal{L}(v, \lambda)$, $\forall (v, \mu) \in V \times E$.

In other words, $(u, \lambda) \in V \times E$ is a saddle point of $\mathcal{L}$ if the point u is
a minimum for the functional $\mathcal{L}(\cdot, \lambda) : V \ni v \mapsto \mathcal{L}(v, \lambda) \in \mathbb{R}$, and if the
point $\lambda$ is a maximum for the functional

          $\mathcal{L}(u, \cdot) : E \ni \mu \mapsto \mathcal{L}(u, \mu) \in \mathbb{R}$,

i.e. $\sup_{\mu \in E} \mathcal{L}(u, \mu) = \mathcal{L}(u, \lambda) = \inf_{v \in V} \mathcal{L}(v, \lambda)$.

Theorem 1.2. (Ky Fan and Sion). Let V and E be two Hausdorff
topological vector spaces, U be a convex compact subset of V and $\Lambda$ be a
convex compact subset of E. Suppose

          $\mathcal{L} : U \times \Lambda \to \mathbb{R}$

is a functional such that

(i) for every $v \in U$ the functional $\mathcal{L}(v, \cdot) : \Lambda \ni \mu \mapsto \mathcal{L}(v, \mu) \in \mathbb{R}$ is
upper semi-continuous and concave,

(ii) for every $\mu \in \Lambda$ the functional $\mathcal{L}(\cdot, \mu) : U \ni v \mapsto \mathcal{L}(v, \mu) \in \mathbb{R}$ is
lower semi-continuous and convex.

Then there exists a saddle point $(u, \lambda) \in U \times \Lambda$ for $\mathcal{L}$.

Lagrangian and Lagrange Multipliers 

First of all we need a method of describing a set of constraints by 
means of a functional. 

Suppose V is a Hilbert space and U is a given subset of V. In all
our applications U will be the set of constraints.

Let E be a vector space. We recall that a cone with vertex at 0 in E
is a subset $\Lambda$ of E which is left invariant by the action of $\mathbb{R}_+$, the set of
non-negative real numbers: i.e. if $\lambda \in \Lambda$ and if $\alpha \in \mathbb{R}$ with $\alpha \ge 0$ then $\alpha\lambda$
also belongs to $\Lambda$.

We assume that there exists a vector space E, a cone $\Lambda$ with vertex
at 0 in E and a mapping

          $\Phi : V \times \Lambda \to \mathbb{R}$

such that

(i) the mapping $\Lambda \ni \mu \mapsto \Phi(v, \mu) \in \mathbb{R}$ is homogeneous of degree one,

          i.e. $\Phi(v, \rho\mu) = \rho\,\Phi(v, \mu)$, $\forall \rho \ge 0$,

(ii) a point $v \in V$ belongs to U if and only if

          $\Phi(v, \mu) \le 0$, $\forall \mu \in \Lambda$.

The choice of the cone $\Lambda$ and the mapping $\Phi$ with the two properties
above is not unique in general.

The vector space E often is a topological vector space.

We illustrate the choice of $\Lambda$ and $\Phi$ with the following example.

Example 1.1. Suppose U is a subset of $\mathbb{R}^n$ defined by

          $U = \{v \mid v \in \mathbb{R}^n$, $g(v) = (g_1(v), \dots, g_m(v)) \in \mathbb{R}^m$ such that $g_i(v) \le 0$ $\forall i = 1, \dots, m\}$,

i.e. g is a mapping of $\mathbb{R}^n \to \mathbb{R}^m$ and $g_i(v) \le 0$ $\forall i$. We take

          $\Lambda = \{\mu \in \mathbb{R}^m \mid \mu = (\mu_1, \dots, \mu_m)$ with $\mu_i \ge 0\}$.

Clearly $\Lambda$ is a (convex) cone with vertex at $0 \in \mathbb{R}^m$. Then we define

          $\Phi : \mathbb{R}^n \times \Lambda \to \mathbb{R}$

by

          $\Phi(v, \mu) = (\mu, g(v))_{\mathbb{R}^m} = \sum_{i=1}^{m} \mu_i g_i(v)$.
One can immediately check that $\Phi$ has the properties (i) and (ii) and

          $U = \{v \in \mathbb{R}^n; \Phi(v, \mu) = (\mu, g(v))_{\mathbb{R}^m} \le 0$ $\forall \mu \in \Lambda\}$.

More generally if U is defined by a mapping $g : \mathbb{R}^n \to H$ where H
is any vector space in which we have a notion of positivity then we can
take

          $\Lambda = \{\mu \mid \mu \in H', \mu \ge 0\}$

and

          $\Phi(v, \mu) = \langle \mu, g(v) \rangle_{H' \times H}$.

Example 1.2. Let U be a convex closed subset of a Banach space V. We
define a function $h : V' \to \mathbb{R}$ by

          $h(\mu) = \sup_{v \in U} \langle \mu, v \rangle_{V' \times V}$.

Then clearly $h \ge 0$.

We take for the cone $\Lambda$:

          $\Lambda = \{\mu \mid \mu \in V', h(\mu) < +\infty\}$

and define $\Phi : V \times \Lambda \to \mathbb{R}$ by

          $\Phi(v, \mu) = \langle \mu, v \rangle - h(\mu)$.

It is clear from the very definition that if $v \in V$ and $\Phi(v, \mu) \le 0$ for all $\mu \in \Lambda$ then
$v \in U$. In fact, if $v \notin U$ then, since U is a closed convex set in V, by the
Hahn-Banach theorem there exists an element $\mu \in V'$ such that $\mu(u) = 0$ $\forall u \in U$
and $\mu(v) = 1$. Then for this $\mu$, $h(\mu) = 0$ so that $\mu \in \Lambda$ and $\Phi(v, \mu) =
\langle \mu, v \rangle = 1$ which contradicts the fact that $\Phi(v, \mu) \le 0$. Hence $v \in U$.

The arguments of Example 1.1 can be used to formulate the general
problem of non-linear programming considered in Chapter 4: Given
(k + 1) functionals $J_0, J_1, \dots, J_k$ on a Hilbert space V, to find

          $u \in U = \{v \mid v \in V; J_i(v) \le 0$ for $i = 1, \dots, k\}$,
          $J_0(u) = \inf_{v \in U} J_0(v)$.

We note that $v \mapsto (J_1(v), \dots, J_k(v))$ defines a mapping of V into $\mathbb{R}^k$.
We take as E the space $(\mathbb{R}^k)' = \mathbb{R}^k$ and

          $\Lambda = \{\mu \in \mathbb{R}^k, \mu_i \ge 0, i = 1, \dots, k\}$,

          $\Phi(v, \mu) = \sum_{i=1}^{k} \mu_i J_i(v)$.

It is immediately seen that $\Phi$ satisfies (i) and (ii), and that an element
$v \in V$ belongs to U if and only if $\Phi(v, \mu) \le 0$, $\forall \mu \in \Lambda$. So our problem can
be reformulated equivalently as follows:

          To find $u \in V$ such that $\sup_{\mu \in \Lambda} \Phi(u, \mu) \le 0$ and

          $J_0(u) = \inf \{J_0(v); \Phi(v, \mu) \le 0$ $\forall \mu \in \Lambda\}$.

These considerations are very general and we have the following 
simple proposition. 

Proposition 1.1. Let V be a normed space and U be a subset of V such
that we can find a cone $\Lambda$ with vertex at 0 (in a suitable vector space)
and a function $\Phi : V \times \Lambda \to \mathbb{R}$ satisfying (i) and (ii). Let $J : V \to \mathbb{R}$ be a
given functional. Then the following two problems are equivalent:

Primal problem: To find $u \in U$ such that $J(u) = \inf_{v \in U} J(v)$.

Minimax problem: To find a point $(u, \lambda) \in V \times \Lambda$ such that

(1.3)     $J(u) + \Phi(u, \lambda) = \inf_{v \in V} \sup_{\mu \in \Lambda} (J(v) + \Phi(v, \mu))$.

Proof. First of all we show that

          $\sup_{\mu \in \Lambda} \Phi(v, \mu) = 0$ if $v \in U$,
          $\sup_{\mu \in \Lambda} \Phi(v, \mu) = +\infty$ if $v \notin U$.

In fact, if $v \in U$ then by (ii) $\Phi(v, \mu) \le 0$ $\forall \mu \in \Lambda$. Since $0 \in \Lambda$ we get by
homogeneity (i) $\Phi(v, 0) = 0$ and hence

          $\sup_{\mu \in \Lambda} \Phi(v, \mu) = 0$.

Suppose now $v \notin U$. Then there exists an element $\mu \in \Lambda$ such that
$\Phi(v, \mu) > 0$. But for any $\rho > 0$, $\rho\mu \in \Lambda$ and by homogeneity

          $\Phi(v, \rho\mu) = \rho\,\Phi(v, \mu) > 0$

so that $\Phi(v, \rho\mu) \to +\infty$ as $\rho \to +\infty$. This means that

          $\sup_{\mu \in \Lambda} \Phi(v, \mu) = +\infty$ if $v \notin U$.

Next we can write

          $\sup_{\mu \in \Lambda} (J(v) + \Phi(v, \mu)) = J(v) + \sup_{\mu \in \Lambda} \Phi(v, \mu)$
                                              $= J(v)$ if $v \in U$, and $= +\infty$ if $v \notin U$,

and we therefore find

          $\inf_{v \in V} \sup_{\mu \in \Lambda} (J(v) + \Phi(v, \mu)) = \inf_{v \in U} J(v)$.

This proves the equivalence of the two problems.

□

Suppose given a functional $J : V \to \mathbb{R}$ on a Hilbert space V and a
subset U of V for which there exists a cone $\Lambda$ and a function $\Phi : V \times \Lambda \to \mathbb{R}$
satisfying the conditions (i) and (ii).

Definition 1.1. The Lagrangian associated to the primal problem for J
(with constraints defined by the set U) is the functional $\mathcal{L} : V \times \Lambda \to \mathbb{R}$
defined by

(1.4)     $\mathcal{L}(v, \mu) = J(v) + \Phi(v, \mu)$.

$\mu \in \Lambda$ is called a Lagrange multiplier.

The relation between the minimax problem and the saddle point for
the Lagrangian is expressed by the following proposition. This
proposition is true for any functional $\mathcal{L}$ on $V \times \Lambda$.
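Proposition 1.2. If $(u, \lambda)$ is a saddle point for $\mathcal{L}$ then we have

(1.5)     $\sup_{\mu \in \Lambda} \inf_{v \in V} \mathcal{L}(v, \mu) = \mathcal{L}(u, \lambda) = \inf_{v \in V} \sup_{\mu \in \Lambda} \mathcal{L}(v, \mu)$.

Proof. First of all for any functional $\mathcal{L}$ on $V \times \Lambda$ we have the inequality

          $\sup_{\mu \in \Lambda} \inf_{v \in V} \mathcal{L}(v, \mu) \le \inf_{v \in V} \sup_{\mu \in \Lambda} \mathcal{L}(v, \mu)$.

In fact, for any point $(v, \mu) \in V \times \Lambda$, we have

          $\inf_{v' \in V} \mathcal{L}(v', \mu) \le \mathcal{L}(v, \mu) \le \sup_{\mu' \in \Lambda} \mathcal{L}(v, \mu')$.

But here the first term $\inf_{v' \in V} \mathcal{L}(v', \mu)$ is only a function of $\mu$ while
$\sup_{\mu' \in \Lambda} \mathcal{L}(v, \mu')$ is a function only of v. Hence we get the required
inequality.

Next, if $(u, \lambda)$ is a saddle point for $\mathcal{L}$ then by definition

          $\inf_{v \in V} \sup_{\mu \in \Lambda} \mathcal{L}(v, \mu) \le \sup_{\mu \in \Lambda} \mathcal{L}(u, \mu) = \mathcal{L}(u, \lambda)$
                $= \inf_{v \in V} \mathcal{L}(v, \lambda) \le \sup_{\mu \in \Lambda} \inf_{v \in V} \mathcal{L}(v, \mu)$.

The two inequalities together give the equalities in the assertion of
the proposition.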
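
Both halves of this argument can be illustrated on a finite table of values. The following is a minimal sketch, assuming V and $\Lambda$ are replaced by finite index sets and the functional is stored as a list of rows `L[v][mu]`; the function names are illustrative. It computes sup-inf and inf-sup and lists the saddle points in the sense of (1.2).

```python
def sup_inf(L):
    """sup over mu of inf over v of L[v][mu]."""
    n_v, n_mu = len(L), len(L[0])
    return max(min(L[v][m] for v in range(n_v)) for m in range(n_mu))


def inf_sup(L):
    """inf over v of sup over mu of L[v][mu]."""
    return min(max(row) for row in L)


def saddle_points(L):
    """All (v, mu) with L[v][mu'] <= L[v][mu] <= L[v'][mu] for all v', mu'."""
    points = []
    for v, row in enumerate(L):
        for m, value in enumerate(row):
            column_min = min(L[k][m] for k in range(len(L)))
            if value == max(row) and value == column_min:
                points.append((v, m))
    return points
```

For the table [[3, 1], [4, 2]] the point (0, 0) is a saddle point and both quantities equal 3, while for [[0, 1], [1, 0]] there is no saddle point and sup-inf = 0 is strictly smaller than inf-sup = 1.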

Definition. The problem of finding $(w, \lambda) \in V \times \Lambda$ such that

(1.6)     $\mathcal{L}(w, \lambda) = \sup_{\mu \in \Lambda} \inf_{v \in V} \mathcal{L}(v, \mu)$

is called the "dual problem" associated to the primal problem,
i.e.

(1.6)'    To find $(w, \lambda) \in V \times \Lambda$ such that
          $J(w) + \Phi(w, \lambda) = \sup_{\mu \in \Lambda} \inf_{v \in V} (J(v) + \Phi(v, \mu))$.

Remark. Since the choice of the cone $\Lambda$ and the function $\Phi : V \times \Lambda \to \mathbb{R}$
is not unique there are many ways of defining the dual problem for a
given minimization problem.

In the following example we shall determine the dual problem of a
linear programming problem.

Suppose given a linear functional $J : \mathbb{R}^n \to \mathbb{R}$ of the form $J(v) =
(c, v)_{\mathbb{R}^n}$ where $c \in \mathbb{R}^n$ is a fixed vector, a linear mapping $A : \mathbb{R}^n \to \mathbb{R}^m$ and
a vector $b \in \mathbb{R}^m$. Let U be the set in $\mathbb{R}^n$

(1.7)     $U = \{v \in \mathbb{R}^n; Av - b = ((Av - b)_1, \dots, (Av - b)_m) \in \mathbb{R}^m$
          such that $(Av - b)_i \le 0$ for all $i = 1, \dots, m\}$.

Consider the linear programming problem:

(1.8)     To find $u \in U$ such that $J(u) = \inf_{v \in U} J(v)$,

i.e.

(1.8)'    To find $u \in \mathbb{R}^n$ such that
          $Au - b \le 0$ and $(c, u)_{\mathbb{R}^n} \le (c, v)_{\mathbb{R}^n}$ for all $v \in \mathbb{R}^n$ satisfying $Av - b \le 0$.

We consider another linear programming problem defined as follows.

Let $J^* : \mathbb{R}^m \to \mathbb{R}$ be the functional $J^*(\mu) = (b, \mu)_{\mathbb{R}^m}$ and $U^*$ be the
subset of $\mathbb{R}^m$ given by

(1.9)     $U^* = \{w \mid w \in \mathbb{R}^m, A^* w + c \in \mathbb{R}^n$ such that $(A^* w + c)_j \ge 0$ for all $j = 1, \dots, n\}$

where $A^* : \mathbb{R}^m \to \mathbb{R}^n$ is the adjoint of A.

(1.10)    To find $\mu \in U^*$ such that $J^*(\mu) = \inf_{w \in U^*} J^*(w)$,

i.e.

(1.10)'   To find $\mu \in \mathbb{R}^m$ such that
          $A^* \mu + c \ge 0$ and $(b, \mu)_{\mathbb{R}^m} \le (b, w)_{\mathbb{R}^m}$ for all $w \in \mathbb{R}^m$ such that $A^* w + c \ge 0$.

Proposition 1.3. The linear programming problem (1.10)' is the dual
of the linear programming problem (1.8)'.
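Proof. We have $V = \mathbb{R}^n$, $E = \mathbb{R}^m$. Take the cone in $\mathbb{R}^m$ defined by

          $\Lambda = \{\mu \mid \mu \in (\mathbb{R}^m)' = \mathbb{R}^m, \mu = (\mu_1, \dots, \mu_m)$ with $\mu_i \ge 0$ for all $i = 1, \dots, m\}$

and the function

          $\Phi(v, \mu) = (Av - b, \mu)_{\mathbb{R}^m}$.

By the very definitions we have $U = \{v \in \mathbb{R}^n \mid \Phi(v, \mu) \le 0$ $\forall \mu \in \Lambda\}$. The
Lagrangian $\mathcal{L}(v, \mu)$ is given by

          $\mathcal{L}(v, \mu) = (c, v)_{\mathbb{R}^n} + (Av - b, \mu)_{\mathbb{R}^m}$.

Hence by Definition (1.6)' the dual problem is the following: To
find $(w, \lambda) \in \mathbb{R}^n \times \Lambda$ such that

          $\mathcal{L}(w, \lambda) = \sup_{\mu \in \Lambda} \inf_{v \in \mathbb{R}^n} \mathcal{L}(v, \mu)$
                          $= \sup_{\mu \in \Lambda} \inf_{v \in \mathbb{R}^n} ((c, v)_{\mathbb{R}^n} + (Av - b, \mu)_{\mathbb{R}^m})$.

We can write

          $\mathcal{L}(v, \mu) = (A^* \mu + c, v)_{\mathbb{R}^n} - (b, \mu)_{\mathbb{R}^m}$

and hence

          $\inf_{v \in \mathbb{R}^n} \mathcal{L}(v, \mu) = \inf_{v \in \mathbb{R}^n} (A^* \mu + c, v)_{\mathbb{R}^n} - (b, \mu)_{\mathbb{R}^m}$.

If $A^* \mu + c \ne 0$ then by the Cauchy-Schwarz inequality we have

          $-\|v\|_{\mathbb{R}^n} \|A^* \mu + c\|_{\mathbb{R}^n} \le (A^* \mu + c, v)_{\mathbb{R}^n}$,

with equality when v is a negative multiple of $A^* \mu + c$, and so along such v

          $(A^* \mu + c, v)_{\mathbb{R}^n} \to -\infty$ as $\|v\| \to +\infty$,

i.e.

          $\inf_{v \in \mathbb{R}^n} (A^* \mu + c, v)_{\mathbb{R}^n} = -\infty$ if $A^* \mu + c \ne 0$.

But if $A^* \mu + c = 0$ then $\inf_{v \in \mathbb{R}^n} (A^* \mu + c, v)_{\mathbb{R}^n} = 0$. Thus our dual
problem becomes

          $\sup_{\mu \in \Lambda} \inf_{v \in \mathbb{R}^n} \mathcal{L}(v, \mu) = \sup_{\mu} (-(b, \mu)_{\mathbb{R}^m}) = -\inf_{\mu} (b, \mu)_{\mathbb{R}^m}$,

where, by the computation above, the supremum and infimum on the right are taken over
those $\mu \in \Lambda$ with $A^* \mu + c = 0$.

In other words the dual problem is nothing but (1.10)'.

□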
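
The computation in the proof can be followed on a tiny instance. The following is a sketch under stated assumptions: the data A, b, c and the sampling grids are made up for illustration (n = 1, m = 2, constraints v ≤ 2 and −v ≤ 0, cost −v), and the helper names are hypothetical. The dual function returns $-\infty$ unless $A^*\mu + c = 0$, exactly as derived above, and is maximized over sampled $\mu \ge 0$.

```python
import math


def adjoint_apply(A, mu):
    """Compute A* mu for A given as a list of m rows of length n."""
    m, n = len(A), len(A[0])
    return [sum(A[i][j] * mu[i] for i in range(m)) for j in range(n)]


def dual_function(A, b, c, mu):
    """inf over v of (c, v) + (Av - b, mu): -(b, mu) if A* mu + c = 0, else -inf."""
    residual = [x + cj for x, cj in zip(adjoint_apply(A, mu), c)]
    if any(r != 0 for r in residual):
        return -math.inf
    return -sum(bi * mi for bi, mi in zip(b, mu))


def primal_value(A, b, c, candidates):
    """Minimum of (c, v) over the sampled candidates v that satisfy Av - b <= 0."""
    def feasible(v):
        return all(sum(A[i][j] * v[j] for j in range(len(v))) <= b[i]
                   for i in range(len(b)))
    return min(sum(cj * vj for cj, vj in zip(c, v))
               for v in candidates if feasible(v))


# Illustrative data (an assumption, not taken from the text).
A = [[1.0], [-1.0]]
b = [2.0, 0.0]
c = [-1.0]
grid = [0.5 * k for k in range(7)]  # 0.0, 0.5, ..., 3.0

primal = primal_value(A, b, c, [[v] for v in grid])
dual = max(dual_function(A, b, c, [m1, m2]) for m1 in grid for m2 in grid)
```

On this instance the sampled primal minimum and the sampled dual maximum coincide, attained at v = 2 and $\mu$ = (1, 0); a grid search is of course only a check on a toy problem, not a general solution method.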
We conclude this section with the following 

Proposition 1.4. If $(u, \lambda) \in V \times \Lambda$ is a saddle point for the Lagrangian
associated to the primal problem then u is a solution of the primal problem
and $\lambda$ is a solution of the dual problem.

Proof. That $(u, \lambda)$ is a saddle point for the Lagrangian $\mathcal{L}$ is equivalent to
saying that

(1.11)    $J(u) + \Phi(u, \mu) \le J(u) + \Phi(u, \lambda) \le J(v) + \Phi(v, \lambda)$, $\forall (v, \mu) \in V \times \Lambda$.

From the first inequality we have

(1.12)    $\Phi(u, \mu) \le \Phi(u, \lambda)$, $\forall \mu \in \Lambda$.

Taking $\mu = 0$ in this inequality we get $\Phi(u, 0) \le \Phi(u, \lambda)$ which
means by homogeneity $\Phi(u, \lambda) \ge 0$. Similarly taking $\mu = 2\lambda$ and using
homogeneity we get

          $2\Phi(u, \lambda) = \Phi(u, 2\lambda) \le \Phi(u, \lambda)$,
          i.e. $\Phi(u, \lambda) \le 0$.

Hence we find that $\Phi(u, \lambda) = 0$. Then it follows from (1.12) that

          $\Phi(u, \mu) \le 0$, $\forall \mu \in \Lambda$

and therefore $u \in U$ by definition of $\Lambda$ and $\Phi$. Thus we have

(1.13)    $u \in U$, $\lambda \in \Lambda$, $\Phi(u, \lambda) = 0$ and
          $J(u) + \Phi(u, \lambda) \le J(v) + \Phi(v, \lambda)$ $\forall v \in V$.

Conversely, it is immediate to see that (1.13) implies (1.11). It is
enough to observe that $\Phi(u, \mu) \le 0 = \Phi(u, \lambda)$ $\forall \mu \in \Lambda$ since $u \in U$ so that
we have the inequality

          $J(u) + \Phi(u, \mu) \le J(u) + \Phi(u, \lambda)$.

Now in (1.13) we take $v \in U$ so that $\Phi(v, \mu) \le 0$, $\forall \mu \in \Lambda$ and (1.13) will
imply

(1.14)    $u \in U$, $\lambda \in \Lambda$, $\Phi(u, \lambda) = 0$ and
          $J(u) \le J(v)$ $\forall v \in U$,

which proves that u is a solution of the primal problem. We have
already seen in Proposition 1.1 the implication that if u is a solution of the
primal problem then

          $\mathcal{L}(u, \lambda) = \inf_{v \in V} \sup_{\mu \in \Lambda} \mathcal{L}(v, \mu)$.

On the other hand, if we use Proposition 1.2 it follows that $\lambda$ is a
solution of the dual problem.

□
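
The saddle point inequalities (1.11) and the relation $\Phi(u, \lambda) = 0$ can be tested on samples for a small linear programme. The following is a sketch only: the data (one variable, constraints v ≤ 2 and −v ≤ 0, cost −v), the sample sets and the function names are assumptions made for illustration, and a check on finitely many samples does not prove that a point is a saddle point.

```python
def lagrangian(A, b, c, v, mu):
    """L(v, mu) = (c, v) + (Av - b, mu) for A given as a list of rows."""
    slack = [sum(A[i][j] * v[j] for j in range(len(v))) - b[i]
             for i in range(len(b))]
    cost = sum(cj * vj for cj, vj in zip(c, v))
    return cost + sum(s * m for s, m in zip(slack, mu))


def is_saddle_on_samples(A, b, c, u, lam, v_samples, mu_samples, tol=1e-12):
    """Check L(u, mu) <= L(u, lam) <= L(v, lam) on the given samples."""
    centre = lagrangian(A, b, c, u, lam)
    left = all(lagrangian(A, b, c, u, mu) <= centre + tol for mu in mu_samples)
    right = all(centre <= lagrangian(A, b, c, v, lam) + tol for v in v_samples)
    return left and right


# Illustrative data (an assumption, not taken from the text).
A = [[1.0], [-1.0]]
b = [2.0, 0.0]
c = [-1.0]
v_samples = [[-3.0 + 0.5 * k] for k in range(13)]                 # v in [-3, 3]
mu_samples = [[0.5 * p, 0.5 * q] for p in range(7) for q in range(7)]  # mu >= 0
```

With these data the pair u = 2, $\lambda$ = (1, 0) passes the test and satisfies $\Phi(u, \lambda) = \mathcal{L}(u, \lambda) - J(u) = 0$, whereas u = 1 with the same $\lambda$ fails the left inequality at $\mu = 0$.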



2 Duality in Finite Dimensional Spaces Via 
Hahn - Banach Theorem 

In this section we describe a duality method based on the classical Hahn- 
Banach theorem for convex programming problem in finite dimensional 
spaces i.e. our primal problem is that of minimizing a convex functional 
on a finite dimensional vector space subject to constraints defined by 
convex functionals. 

We introduce a condition on the constraints which is of fundamental
importance, called the Qualifying hypothesis. Under this hypothesis we
prove that if the primal problem has a solution then there exists a saddle
point for the Lagrangian associated to it. We shall also give sufficient
conditions in order that the Qualifying hypothesis on the constraints is
satisfied.
Let $J_i : \mathbb{R}^n \to \mathbb{R}$ $(i = 0, 1, \dots, k)$ be (k + 1) convex functionals on $\mathbb{R}^n$
and K be the set defined by

          $K = \{v \mid v \in \mathbb{R}^n; J_i(v) \le 0$ for $i = 1, \dots, k\}$.

Our primal problem then is

Problem 2.1. To find $u \in K$ such that $J_0(u) = \inf_{v \in K} J_0(v)$.

It is clear that K is a convex set.
Let

(2.1)     $j = \inf_{v \in K} J_0(v)$.

We introduce the Lagrangian associated to the problem (2.1) as
described in the previous section. More precisely, let

          $\Lambda = \{\mu \mid \mu = (\mu_1, \dots, \mu_k) \in \mathbb{R}^k$ such that $\mu_i \ge 0\}$

which is clearly a cone with vertex at 0 in $\mathbb{R}^k$ and let

          $\Phi : \mathbb{R}^n \times \Lambda \to \mathbb{R}$

be defined by

          $\Phi(v, \mu) = \sum_{i=1}^{k} \mu_i J_i(v)$.

Then the Lagrangian associated to the problem (2.1) is

          $\mathcal{L}(v, \mu) = J_0(v) + \sum_{i=1}^{k} \mu_i J_i(v)$.

Suppose that the problem (2.1) has a solution. Then we wish to find
conditions on the constraints $J_i$ in order that there exists a saddle point
for $\mathcal{L}$. For this purpose we proceed as follows:

Suppose $S$ and $T$ are two subsets of $\mathbb{R}^{k+1}$ defined in the following
way:

$S$ is the set of all points

$(J_0(v) - j + s_0,\ J_1(v) + s_1,\ \dots,\ J_k(v) + s_k) \in \mathbb{R}^{k+1}$,

where $v \in \mathbb{R}^n$ and $s = (s_0, s_1, \dots, s_k) \in \mathbb{R}^{k+1}$ is such that $s_i \ge 0$ $\forall i$.

$T$ is the set of all points

$(-t_0, -t_1, \dots, -t_k) \in \mathbb{R}^{k+1}$ where $t_i \ge 0$ $\forall i$.

It is obvious that $T$ is convex. In fact $T$ is nothing but the negative
cone in $\mathbb{R}^{k+1}$. On the other hand, since $J_0, J_1, \dots, J_k$ are convex and
$s_i \ge 0$ $\forall i$ it follows that $S$ is also convex. It is also clear that $\operatorname{Int} T \ne \emptyset$.
In fact any point $(-t_0, -t_1, \dots, -t_k) \in \mathbb{R}^{k+1}$ with $t_i > 0$ $\forall i$ is an interior
point.

Next we claim that $S \cap (\operatorname{Int} T) = \emptyset$. In fact, if $S \cap (\operatorname{Int} T) \ne \emptyset$ then
there exist

some $t \in \mathbb{R}^{k+1}$ with $t = (t_0, t_1, \dots, t_k)$, $t_i > 0$ $\forall i$,

some $v \in \mathbb{R}^n$, and an $s \in \mathbb{R}^{k+1}$ with $s = (s_0, s_1, \dots, s_k)$, $s_i \ge 0$ $\forall i$,
such that

$J_0(v) - j + s_0 = -t_0,\ J_1(v) + s_1 = -t_1,\ \dots,\ J_k(v) + s_k = -t_k$.

Now we have from this

$J_i(v) = -t_i - s_i < 0$ since $t_i > 0$ and $s_i \ge 0$ for any $i = 1, \dots, k$.

This means that $v \in K$. On the other hand,

$J_0(v) = -t_0 - s_0 + j < j = \inf_{w \in K} J_0(w)$

which is impossible since $v \in K$.

We can now apply the Hahn-Banach theorem to the sets $S$ and $T$ in the
form we have recalled earlier. There exist an $F \in (\mathbb{R}^{k+1})' = \mathbb{R}^{k+1}$
and an $\alpha \in \mathbb{R}$ such that $F \ne 0$, $F(x) \ge \alpha \ge F(y)$ where $x \in S$ and $y \in T$. More
precisely we can write this as follows:

$\exists F = (\alpha_0, \alpha_1, \dots, \alpha_k) \in \mathbb{R}^{k+1}$ such that $\sum_{i=0}^{k} |\alpha_i| > 0$ and $\exists \alpha \in \mathbb{R}$
such that

(2.2)  $\alpha_0 (J_0(v) - j + s_0) + \sum_{i=1}^{k} \alpha_i (J_i(v) + s_i) \ge \alpha \ge -\sum_{i=0}^{k} \alpha_i t_i$,
       $\forall v \in \mathbb{R}^n$, $s = (s_0, \dots, s_k)$ with $s_i \ge 0$ $\forall i$
       and $t = (t_0, t_1, \dots, t_k)$ with $t_i \ge 0$ $\forall i$.

We next show from (2.2) that we have

(2.3)  $\alpha = 0$, $\alpha_i \ge 0$ $\forall i$ and $\sum_{i=0}^{k} \alpha_i > 0$.

In fact, if we take $t_1 = \dots = t_k = 0$ then we get, from the second
inequality in (2.2),

$\alpha \ge -\alpha_0 t_0 = (-\alpha_0) t_0, \quad \forall t_0 \ge 0$.

If $\alpha_0 < 0$ then $(-\alpha_0) t_0 \to +\infty$ as $t_0 \to +\infty$ and therefore we
necessarily have $\alpha_0 \ge 0$. Similarly we can show that $\alpha_i \ge 0$ $\forall i = 0, 1, \dots, k$.
Then

$\sum_{i=0}^{k} |\alpha_i| = \sum_{i=0}^{k} \alpha_i > 0$ since $F \ne 0$.

If we take $t_0 = t_1 = \dots = t_k = 0$ we also find, from the second
inequality in (2.2), that $\alpha \ge 0$.

We have therefore only to show that $\alpha \le 0$. For this, taking
$s_0 = \dots = s_k = 0$ in the first inequality of (2.2) we get

(2.4)  $\alpha_0 (J_0(v) - j) + \sum_{i=1}^{k} \alpha_i J_i(v) \ge \alpha$.

Suppose $v_m$ is a minimizing sequence for the Problem 2.1,
i.e. $v_m \in K$ and $J_0(v_m) \to j = \inf_{v \in K} J_0(v)$.

This means that $J_i(v_m) \le 0$ for $i = 1, \dots, k$ and $J_0(v_m) \to j$. Hence
(2.4) will imply, since $\alpha_i \ge 0$,

$\alpha_0 (J_0(v_m) - j) \ge \alpha_0 (J_0(v_m) - j) + \sum_{i=1}^{k} \alpha_i J_i(v_m) \ge \alpha$.

Now taking limits as $m \to +\infty$ it follows that $\alpha \le 0$. Thus we have

(2.5)  $\alpha_i \ge 0$ for $i = 0, 1, \dots, k$ and $\sum_{i=0}^{k} \alpha_i > 0$,
       $\alpha_0 (J_0(v) - j) + \sum_{i=1}^{k} \alpha_i J_i(v) \ge 0, \quad \forall v \in \mathbb{R}^n$.

We now make the fundamental hypothesis that

(2.6)  $\alpha_0 > 0$.

Under the hypothesis (2.6) if we write $\lambda_i = \alpha_i / \alpha_0$ then (2.5) can be
written in the form

(2.7)  $\lambda_i \ge 0$ for $i = 1, \dots, k$ and
       $j \le J_0(v) + \sum_{i=1}^{k} \lambda_i J_i(v), \quad \forall v \in \mathbb{R}^n$,

i.e. $\lambda \in \Lambda$ and $\mathcal{L}(v, \lambda) \ge j$ $\forall v \in \mathbb{R}^n$.

The condition (2.6) is well known in the literature on optimization.
We introduce the following definition.

Definition 2.1. Any hypothesis on the constraints $J_i$ which implies (2.6)
is called a Qualifying hypothesis.

We shall see a little later some examples of Qualifying hypotheses.

(See m, 123. BO)- 

We have thus proved the

Theorem 2.1. If all the functionals $J_i$ $(i = 0, 1, \dots, k)$ are convex and if
the Qualifying hypothesis is satisfied then there exists a $\lambda \in \Lambda$ such that
$\mathcal{L}(v, \lambda) \ge j$ $\forall v \in \mathbb{R}^n$,

i.e. there exists a $\lambda = (\lambda_1, \dots, \lambda_k) \in \mathbb{R}^k$ with $\lambda_i \ge 0$ $\forall i$ such that

$J_0(v) + \sum_{i=1}^{k} \lambda_i J_i(v) \ge j, \quad \forall v \in \mathbb{R}^n$.

We can also deduce from (2.7) the following result.

Theorem 2.2. Suppose all the functionals $J_0, J_1, \dots, J_k$ are convex and
the Qualifying hypothesis holds. If the Problem 2.1 has a solution, i.e.

(2.8)  there exists a $u \in K$ such that $J_0(u) = j = \inf_{v \in K} J_0(v)$,

then the Lagrangian $\mathcal{L}$ has a saddle point.

Proof. We can write (2.7) as

(2.9)  $\lambda_i \ge 0$ for $i = 1, \dots, k$ and
       $J_0(u) \le J_0(v) + \sum_{i=1}^{k} \lambda_i J_i(v) = \mathcal{L}(v, \lambda), \quad \forall v \in \mathbb{R}^n$.

Choosing $v = u$ in (2.9) we find that

$\sum_{i=1}^{k} \lambda_i J_i(u) \ge 0$.

But here $\lambda_i \ge 0$ and $J_i(u) \le 0$ since $u \in K$, so that $\lambda_i J_i(u) \le 0$ for all
$i = 1, \dots, k$ and hence $\sum_{i=1}^{k} \lambda_i J_i(u) \le 0$. Thus we necessarily have

$\sum_{i=1}^{k} \lambda_i J_i(u) = 0$

and, furthermore, it follows immediately from this that
$\lambda_i J_i(u) = 0$ for $i = 1, \dots, k$.

Thus we can rewrite (2.9) once again as:

(2.10)  $\lambda_i \ge 0$, $\lambda_i J_i(u) = 0$, $i = 1, \dots, k$,
        $\mathcal{L}(u, \lambda) = J_0(u) + \sum_{i=1}^{k} \lambda_i J_i(u) \le J_0(v) + \sum_{i=1}^{k} \lambda_i J_i(v) = \mathcal{L}(v, \lambda), \quad \forall v \in \mathbb{R}^n$.

But, since $u \in K$, $J_i(u) \le 0$ and we also have

(2.11)  $\mathcal{L}(u, \mu) = J_0(u) + \sum_{i=1}^{k} \mu_i J_i(u) \le J_0(u) = J_0(u) + \sum_{i=1}^{k} \lambda_i J_i(u) = \mathcal{L}(u, \lambda)$,
        $\forall \mu \in \mathbb{R}^k$ with $\mu = (\mu_1, \dots, \mu_k)$, $\mu_i \ge 0$.

(2.10) and (2.11) together mean that

$\mathcal{L}(u, \mu) \le \mathcal{L}(u, \lambda) \le \mathcal{L}(v, \lambda), \quad \forall v \in \mathbb{R}^n$ and $\forall \mu \in \Lambda$.

This proves the theorem. □
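
To make Theorem 2.2 concrete, the following Python sketch checks the saddle point inequalities on a small example in $\mathbb{R}^2$. The data are illustrative assumptions and do not come from the text: we minimize $J_0(v) = |v - c|^2$ subject to $J_1(v) = |v|^2 - 1 \le 0$ with $c = (2, 1)$ (the names `C`, `j0`, `j1` and the sampling grids are hypothetical). Slater's condition holds with $Z = 0$, and the candidate pair $u = c/|c|$, $\lambda = |c| - 1$ is obtained by hand from $\operatorname{grad} J_0(u) + \lambda \operatorname{grad} J_1(u) = 0$.

```python
import math

# Hypothetical data: minimize |v - C|^2 over the closed unit disc of R^2.
C = (2.0, 1.0)


def j0(v):
    """Cost functional J_0(v) = |v - C|^2 (convex)."""
    return (v[0] - C[0]) ** 2 + (v[1] - C[1]) ** 2


def j1(v):
    """Constraint functional J_1(v) = |v|^2 - 1 (convex); K = {v : j1(v) <= 0}."""
    return v[0] ** 2 + v[1] ** 2 - 1.0


def lagrangian(v, mu):
    """L(v, mu) = J_0(v) + mu * J_1(v) for a single constraint (k = 1)."""
    return j0(v) + mu * j1(v)


# Candidate saddle point computed by hand for this example.
norm_c = math.hypot(C[0], C[1])
u = (C[0] / norm_c, C[1] / norm_c)   # projection of C onto the unit disc
lam = norm_c - 1.0                   # multiplier, positive since |C| > 1
j = j0(u)                            # optimal value of the primal problem


def saddle_point_holds(tol=1e-9):
    """Check L(u, mu) <= L(u, lam) <= L(v, lam) on sample grids (not a proof)."""
    grid = [i / 4.0 - 3.0 for i in range(25)]   # v-coordinates in [-3, 3]
    mus = [i / 2.0 for i in range(21)]          # multipliers in [0, 10]
    left = all(lagrangian(u, mu) <= lagrangian(u, lam) + tol for mu in mus)
    right = all(lagrangian(u, lam) <= lagrangian((x, y), lam) + tol
                for x in grid for y in grid)
    return left and right


print("u =", u, "lambda =", lam, "j =", j)
print("saddle point inequalities hold on the grid:", saddle_point_holds())
```

In this example the constraint is active, so $\lambda_1 J_1(u) = 0$ holds with $\lambda_1 > 0$ and $J_1(u) = 0$, in agreement with (2.10). The grid test is only a sampled check of the two inequalities.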



Some examples of Qualifying hypotheses. We recall that if all the
functionals $J_0, J_1, \dots, J_k$ are convex then we always have (2.5) $\forall v \in \mathbb{R}^n$.
If we suppose $\alpha_0 = 0$ in (2.5) then we get

(2.12)  $\alpha_i \ge 0$ for $i = 1, \dots, k$, $\sum_{i=1}^{k} \alpha_i > 0$ and
        $\sum_{i=1}^{k} \alpha_i J_i(v) \ge 0, \quad \forall v \in \mathbb{R}^n$.

In all the examples we give below we state the Qualifying hypothesis
in the following form. The given hypothesis together with the fact
that $\alpha_0 = 0$ will imply that it is impossible that (2.5) holds, i.e. the
hypothesis will imply that (2.12) cannot hold. Hence if (2.5) should hold
we necessarily have $\alpha_0 > 0$, i.e. (2.6) holds.

Qualifying hypothesis (1). There exists a vector $Z \in \mathbb{R}^n$ such that $J_i(Z) < 0$
for $i = 1, \dots, k$.

This condition is due to Slater (See for instance [6]).

Suppose the Qualifying hypothesis (1) is satisfied. Let $c \in \mathbb{R}$ be such
that $J_i(Z) \le c < 0$ for all $i = 1, \dots, k$. Obviously such a constant $c$
exists since we can take $c = \max_{1 \le i \le k} J_i(Z)$. Now if $\alpha_i \ge 0$ $(i = 1, \dots, k)$
are such that $\sum_{i=1}^{k} \alpha_i > 0$ then

$\sum_{i=1}^{k} \alpha_i J_i(Z) \le c \sum_{i=1}^{k} \alpha_i < 0$.

This means that (2.12) does not hold for the vector $Z \in \mathbb{R}^n$. Hence
$\alpha_0 > 0$ necessarily, so that (2.5) holds $\forall v \in \mathbb{R}^n$ and in particular for $Z$.



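Qualifying hypothesis (2). There do not exist real numbers
$\alpha_i$ $(i = 1, \dots, k)$ with $\alpha_i \ge 0$ and $\sum_{i=1}^{k} \alpha_i > 0$ such that

(2.13)  $\sum_{i=1}^{k} \alpha_i J_i(v) = 0, \quad \forall v \in K$.

Suppose this hypothesis holds and $\alpha_0 = 0$. Then we have (2.12) for
all $v \in \mathbb{R}^n$.

In particular, we have

$\sum_{i=1}^{k} \alpha_i J_i(v) \ge 0, \quad \forall v \in K$.

But $v \in K$ and $\alpha_i \ge 0$ imply that $\alpha_i J_i(v) \le 0$ for $i = 1, \dots, k$ and so
$\sum_{i=1}^{k} \alpha_i J_i(v) \le 0$. The two inequalities together imply that $\exists \alpha_i \ge 0$ with
$\sum_{i=1}^{k} \alpha_i > 0$ such that $\sum_{i=1}^{k} \alpha_i J_i(v) = 0$ $\forall v \in K$, contrary to the hypothesis. Hence
$\alpha_0 > 0$.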



Qualifying hypothesis (3). Suppose $J_i$ $(i = 1, \dots, k)$ further have
gradients $G_i$ $(i = 1, \dots, k)$.

There do not exist real numbers $\alpha_i$ with

(2.14)  $\alpha_i \ge 0$, $i = 1, \dots, k$, $\sum_{i=1}^{k} \alpha_i > 0$ such that
        $\sum_{i=1}^{k} \alpha_i G_i(v) = 0, \quad \forall v \in K$.

The condition (2.14) seems to be due to Kuhn and Tucker [28].

It is enough to show that Qualifying hypothesis (3) implies Qualifying
hypothesis (2). Suppose there exist $\alpha_i \ge 0$, $i = 1, \dots, k$, with
$\sum_{i=1}^{k} \alpha_i > 0$ such that $\sum_{i=1}^{k} \alpha_i J_i(v) = 0$ $\forall v \in K$. Then taking derivatives
it will imply the existence
of $\alpha_i \ge 0$ $(i = 1, \dots, k)$ with $\sum_{i=1}^{k} \alpha_i > 0$ such that $\sum_{i=1}^{k} \alpha_i G_i(v) = 0$ $\forall v \in K$.
This contradicts the given hypothesis. Hence $\alpha_0 > 0$.

Finally we remark that the existence of a saddle point can also be
proved using the minimax theorem of Ky Fan and Sion. We refer for
this to the book of Cea [6].

3 Duality in Infinite Dimensional Spaces Via 
Ky Fan - Sion Theorem 

This section will be concerned with the duality theory for the minimisation
problem with constraints for functionals on infinite dimensional
Hilbert spaces. We confine ourselves to illustrating the method in the special
example of a quadratic form (see the model problem considered
earlier), in which case we have proved the existence
of a unique solution for our problem. As
we have already mentioned this example includes a large class of variational
inequalities associated to second order elliptic differential operators
and conversely. Our main tool in this will be the theorem of Ky
Fan and Sion. However we remark that our method is very general and
is applicable, but for some minor details, to the case of general convex
programming problems in infinite dimensional spaces.

3.1 Duality in the Case of a Quadratic Form

We take for the Hilbert space $V$ the Sobolev space $H^1(\Omega)$ where $\Omega$ is
a bounded open set with smooth boundary $\Gamma$ in $\mathbb{R}^n$. Let $a(\cdot, \cdot)$ be a
continuous quadratic form on $V$ (i.e. it is a symmetric bilinear bicontinuous
mapping: $V \times V \to \mathbb{R}$) and $L(\cdot)$ be a continuous linear functional
on $V$ (i.e. $L \in V'$). We assume that $a(\cdot, \cdot)$ is $H^1(\Omega)$-coercive. Let
$J : H^1(\Omega) \to \mathbb{R}$ be the (strictly) convex continuous functional on $H^1(\Omega)$
defined by

(3.1)  $J(v) = \frac{1}{2} a(v, v) - L(v)$.

We denote by $||| \cdot |||$ the norm $\| \cdot \|_{H^1(\Omega)}$ and by $\| \cdot \|$ the norm $\| \cdot \|_{L^2(\Omega)}$.
Let us consider the set

(3.2)  $K = \{v \mid v \in H^1(\Omega),\ \|v\| \le 1\}$.

We check immediately that $K$ is a closed convex set in $H^1(\Omega)$. We
are interested in the following minimisation problem:

Problem 3.1. To find $u \in K$ such that $J(u) \le J(v)$, $\forall v \in K$.

Since $J$ is $H^1(\Omega)$-coercive (hence strictly convex) and since $J$ has a
gradient and a hessian everywhere in $V$ we know from Theorem 2.1
that the Problem 3.1 has a unique solution.

In order to illustrate our method we shall consider a simple case and
take

(3.3)  $\Lambda = \{\mu \mid \mu \in \mathbb{R},\ \mu \ge 0\}$

and

(3.4)  $\Phi(v, \mu) = \frac{1}{2} \mu (\|v\|^2 - 1)$, $\forall v \in V = H^1(\Omega)$ and $\mu \in \Lambda$.

Thus $K$ is nothing but the set $\{v \mid v \in V,\ \Phi(v, \mu) \le 0\ \forall \mu \in \Lambda\}$. We define the
associated Lagrangian by

$\mathcal{L}(v, \mu) = J(v) + \Phi(v, \mu)$

i.e.

(3.5)  $\mathcal{L}(v, \mu) = \frac{1}{2} a(v, v) - L(v) + \frac{1}{2} \mu (\|v\|^2 - 1)$.

We observe that

(i) the mapping $\mu \mapsto \mathcal{L}(v, \mu)$ is continuous and affine and hence, in
particular, it is concave and upper semi-continuous and

(ii) the mapping $v \mapsto \mathcal{L}(v, \mu)$ is continuous and convex and hence, in
particular, it is convex and weakly lower semi-continuous.

We are now in a position to prove the first result of this section using
the theorem of Ky Fan and Sion. This can be stated as follows:

Theorem 3.1. Suppose the functional $J$ on $V = H^1(\Omega)$ is given by (3.1)
and the closed convex set $K$ of $V$ is given by (3.2). Then the Lagrangian
(3.5) associated to the primal Problem 3.1 has a saddle point. Moreover,
if $(u, \lambda)$ is a saddle point of $\mathcal{L}$ then $u$ is a solution of the generalized
Neumann problem

(3.6)  $Au + \lambda u = f$ in $\Omega$,
       $\partial u / \partial n_A = 0$ on $\Gamma$.

We note that here $u$ and $\lambda$ are subjected to the constraints

(3.7)  $\lambda \ge 0$, $\|u\| \le 1$ but $\lambda (\|u\|^2 - 1) = 0$.

Here the formal (differential) operator $A$ is defined in the following
manner. For any fixed $v \in V = H^1(\Omega)$ the linear mapping $\varphi \mapsto a(v, \varphi)$ is
a continuous linear functional $Av$, i.e. $Av \in V'$. Moreover $v \mapsto Av$ belongs
to $\mathcal{L}(V, V')$ and we have

$(Av, \varphi)_V = a(v, \varphi), \quad \forall \varphi \in H^1(\Omega) = V$.

Similarly $f \in L^2(\Omega)$ is defined by $L(\varphi) = (f, \varphi)_{L^2(\Omega)}$, $\forall \varphi \in V$. Further
$\partial u / \partial n_A$ is the co-normal derivative of $u$ associated to $A$ and is defined
by the Green's formula:

$a(u, \varphi) = (Au, \varphi)_V + \int_\Gamma \frac{\partial u}{\partial n_A} \varphi \, d\sigma, \quad \forall \varphi \in V$,

as in Section 3 of Chapter 2.

In particular, if we take $a(v, v) = |||v|||^2$, then $A = -\Delta + I$ and the problem
is nothing but the classical Neumann problem

(3.6)'  $-\Delta u + u + \lambda u = f$ in $\Omega$,
        $\partial u / \partial n = 0$ on $\Gamma$.

Of course, we again have (3.7).

Proof of Theorem 3.1. Let $\ell > 0$ be any real number. We consider the
subsets $K_\ell$ and $\Lambda_\ell$ of $H^1(\Omega)$ and $\Lambda$ respectively defined by

$K_\ell = \{v \mid v \in H^1(\Omega),\ |||v||| \le \ell\}$,
$\Lambda_\ell = \{\mu \mid \mu \in \mathbb{R},\ 0 \le \mu \le \ell\}$.

It is immediately verified that $K_\ell$ and $\Lambda_\ell$ are convex sets, and that
$\Lambda_\ell$ is a compact set in $\mathbb{R}$. Since $K_\ell$ is a closed bounded convex set in the Hilbert
space $H^1(\Omega)$, $K_\ell$ is weakly compact. We consider $H^1(\Omega)$ with its weak
topology.

Now $H^1(\Omega) = V$ with the weak topology is a Hausdorff topological
vector space. All the hypotheses of the theorem of Ky Fan and Sion are
satisfied by $K_\ell$, $\Lambda_\ell$ and $\mathcal{L}$ in view of (i) and (ii). Hence $\mathcal{L} : K_\ell \times \Lambda_\ell \to \mathbb{R}$
has a saddle point $(u_\ell, \lambda_\ell)$, i.e.

there exist $(u_\ell, \lambda_\ell) \in K_\ell \times \Lambda_\ell$ such that

(3.8)  $J(u_\ell) + \frac{1}{2} \mu (\|u_\ell\|^2 - 1) \le J(u_\ell) + \frac{1}{2} \lambda_\ell (\|u_\ell\|^2 - 1) \le J(v) + \frac{1}{2} \lambda_\ell (\|v\|^2 - 1)$,
       $\forall (v, \mu) \in K_\ell \times \Lambda_\ell$.

We shall show that if we choose $\ell > 0$ sufficiently large then such a
saddle point can be obtained independent of $\ell$ and this would prove the
first part of the assertion. For this we shall first prove that $u_\ell$ and $\lambda_\ell$
are bounded by constants independent of $\ell$.

If we take $\mu = 0 \in \Lambda_\ell$ in (3.8) we get

(3.9)  $J(u_\ell) \le J(v) + \frac{1}{2} \lambda_\ell (\|v\|^2 - 1), \quad \forall v \in K_\ell$

and, in particular, we also get

(3.10)  $J(u_\ell) \le J(v), \quad \forall v \in K \cap K_\ell$.

Taking $v = 0 \in K \cap K_\ell$ in (3.10) we see that $J(u_\ell) \le J(0)\ (= 0)$. On
the other hand, since $a(u_\ell, u_\ell) \ge 0$ and since $u_\ell \in K_\ell$,

$L(u_\ell) \le \|L\|_{V'} |||u_\ell||| \le \ell \|L\|_{V'}$,

we see that

$J(u_\ell) = \frac{1}{2} a(u_\ell, u_\ell) - L(u_\ell) \ge -\ell \|L\|_{V'}$

which proves that $J(u_\ell)$ is also bounded below. Thus we have

(3.11)  $-\ell \|L\|_{V'} \le J(u_\ell) \le J(0)$.

Now by coercivity of $a(\cdot, \cdot)$ and (3.11) we find

$\alpha |||u_\ell|||^2 \le a(u_\ell, u_\ell) = 2 (J(u_\ell) + L(u_\ell)) \le 2 (J(0) + \|L\|_{V'} |||u_\ell|||)$

with a constant $\alpha > 0$ (independent of $\ell$). Here we use the trivial
inequality

$\|L\|_{V'} |||u_\ell||| \le \epsilon |||u_\ell|||^2 + \frac{1}{\epsilon} \|L\|_{V'}^2$ for any $\epsilon > 0$

with $\epsilon = \alpha / 4 > 0$ and we obtain

$|||u_\ell|||^2 \le \frac{4}{\alpha} \left( J(0) + \frac{4}{\alpha} \|L\|_{V'}^2 \right)$.

This proves that there exists a constant $c_1 > 0$ such that

(3.12)  $|||u_\ell||| \le c_1, \quad \forall \ell$.

To prove that $\lambda_\ell$ is also bounded by a constant $c_2 > 0$ independent
of $\ell$, we observe that since $J$ satisfies all the assumptions of the existence
and uniqueness theorem for minimization without constraints proved earlier,
there exists a unique global minimum in $V = H^1(\Omega)$, i.e.

(3.13)  There exists a unique $\tilde{u} \in H^1(\Omega)$ such that $J(\tilde{u}) \le J(v)$, $\forall v \in V$.

In particular $J(\tilde{u}) \le J(u_\ell)$, and hence

$J(\tilde{u}) + \lambda_\ell / 2 \le J(u_\ell) + \lambda_\ell / 2$.

But, if we take $v = 0 \in K_\ell$ in the inequality (3.9) we get

$J(u_\ell) + \lambda_\ell / 2 \le J(0)$.

These two inequalities together imply that

$\lambda_\ell / 2 \le J(0) - J(\tilde{u})$,

i.e.

(3.14)  $0 \le \lambda_\ell \le 2 (J(0) - J(\tilde{u})) = c_2$

which proves that $\lambda_\ell$ is also bounded.

(3.15)  We choose $\ell > \max(c_1, 2 c_2) > 0$.

Next we show that (3.8) holds for any $\mu \in \Lambda$. For this, we use the first
inequality in (3.8) in the form

$\mu (\|u_\ell\|^2 - 1) \le \lambda_\ell (\|u_\ell\|^2 - 1)$.

This implies (i) taking $\mu = 0$, $\lambda_\ell (\|u_\ell\|^2 - 1) \ge 0$ and

(ii) taking $\mu = 2 \lambda_\ell \le 2 c_2 < \ell$, so that $\mu \in \Lambda_\ell$, $\lambda_\ell (\|u_\ell\|^2 - 1) \le 0$. Thus we have

$\lambda_\ell (\|u_\ell\|^2 - 1) = 0$ and $\mu (\|u_\ell\|^2 - 1) \le 0$, $\forall \mu \in \Lambda_\ell$.

In particular, $\mu = \ell \in \Lambda_\ell$ and so $\ell (\|u_\ell\|^2 - 1) \le 0$ which means that

$\|u_\ell\|^2 - 1 \le 0$.

Hence we also have

$\mu (\|u_\ell\|^2 - 1) \le 0$ for any $\mu \ge 0$

and therefore

(3.16)  $\mathcal{L}(u_\ell, \mu) \le \mathcal{L}(u_\ell, \lambda_\ell) \le \mathcal{L}(v, \lambda_\ell)$, $\forall \mu \ge 0$ and $v \in K_\ell$

where $\ell > \max(c_1, 2 c_2)$.

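We have now only to show that we have (3.16) for any $v \in H^1(\Omega) = V$.
For this we note that $|||u_\ell||| \le c_1 < \ell$ and hence we can find an $r > 0$ such
that the ball

$B(u_\ell, r) = \{v \mid v \in H^1(\Omega);\ |||v - u_\ell||| < r\}$

is contained in the ball $B(0, \ell) = \{v \mid v \in H^1(\Omega),\ |||v||| < \ell\}$. In fact, it is
enough to take $0 < r < (\ell - c_1)/2$. Now the functional $\mathcal{L}(\cdot, \lambda_\ell)$ :
$v \mapsto \mathcal{L}(v, \lambda_\ell) = J(v) + \frac{\lambda_\ell}{2} (\|v\|^2 - 1)$ has a local minimum in $B(u_\ell, r)$
(namely at $u_\ell$, by (3.16)).
But since this functional is convex such a minimum is also a global
minimum. This means that

$\inf_{v \in B(u_\ell, r)} \mathcal{L}(v, \lambda_\ell) = \inf_{v \in V} \mathcal{L}(v, \lambda_\ell)$.

On the other hand, since $B(u_\ell, r) \subset K_\ell$ we see from (3.16) that

$\mathcal{L}(u_\ell, \mu) \le \mathcal{L}(u_\ell, \lambda_\ell) \le \inf_{v \in K_\ell} \mathcal{L}(v, \lambda_\ell) \le \inf_{v \in B(u_\ell, r)} \mathcal{L}(v, \lambda_\ell) = \inf_{v \in V} \mathcal{L}(v, \lambda_\ell)$.

In other words, we have

$\mathcal{L}(u_\ell, \mu) \le \mathcal{L}(u_\ell, \lambda_\ell) \le \mathcal{L}(v, \lambda_\ell)$, $\forall v \in V$ and $\forall \mu \ge 0$

which means that $\mathcal{L}$ has a saddle point.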

Finally we prove that $(u, \lambda) = (u_\ell, \lambda_\ell)$ $(\ell > \max(c_1, 2 c_2))$ satisfies
(3.6). First of all the functional $v \mapsto \mathcal{L}(v, \lambda)$ is G-differentiable and has
a gradient everywhere in $V$. In fact, we have

(3.17)  $((\operatorname{grad} \mathcal{L}(\cdot, \lambda))(v), \varphi)_V = a(v, \varphi) - L(\varphi) + \lambda (v, \varphi)$.

We know from the results of Chapter 2 that at the point
$u$ where $v \mapsto \mathcal{L}(v, \lambda)$ has a minimum we should have

(3.18)  $((\operatorname{grad} \mathcal{L}(\cdot, \lambda))(u), \varphi)_V = 0$.

Now, if we use (3.17), (3.18) and the definition of $Au$, $f$ and $\partial u / \partial n_A$
we obtain (3.6).

This proves the theorem completely.

Remark 3.1. The above argument using the theorem of Ky Fan and Sion
can be carried out for the functional $J$ given again by (3.1) but the convex
set $K$ of (3.2) replaced by any one of the following sets

$K_1 = \{v \mid v \in H^1(\Omega),\ v \ge 0 \text{ a.e. in } \Omega\}$,
$K_2 = \{v \mid v \in H^1(\Omega),\ \gamma_0 v \ge 0 \text{ a.e. on } \Gamma\}$ and
$K_3 = \{v \mid v \in H^1(\Omega),\ 1 - \operatorname{grad}^2 v(x) \ge 0 \text{ a.e. in } \Omega\}$.

Since $v \in H^1(\Omega)$, $\gamma_0 v \in H^{1/2}(\Gamma)$, $1 - \operatorname{grad}^2 v(x) \in L^1(\Omega)$ and since

$(H^1_0(\Omega))' = H^{-1}(\Omega)$, $(H^{1/2}(\Gamma))' = H^{-1/2}(\Gamma)$, $(L^1(\Omega))' = L^\infty(\Omega)$

we will have to choose the cone $\Lambda$ respectively in these spaces.

We recall that if $E$ is a vector space in which we have a notion of
positivity then we can define in a natural way a notion of positivity in
its dual space $E'$ by requiring that an element $\mu \in E'$ is positive (i.e. $\mu \ge 0$
in $E'$) if and only if $\langle \mu, \varphi \rangle_{E' \times E} \ge 0$, $\forall \varphi \in E$ with $\varphi \ge 0$. For the
above examples we can take for $E$ the spaces $H^1_0(\Omega)$, $H^{1/2}(\Gamma)$ and $L^1(\Omega)$
respectively and we have notions of positivity for their dual spaces.

We can now take

$\Lambda_1 = \{\mu \mid \mu \in H^{-1}(\Omega),\ \mu \ge 0 \text{ in } \Omega\}$,
$\Lambda_2 = \{\mu \mid \mu \in H^{-1/2}(\Gamma),\ \mu \ge 0 \text{ on } \Gamma\}$ and
$\Lambda_3 = \{\mu \mid \mu \in L^\infty(\Omega),\ \mu \ge 0 \text{ in } \Omega\}$

and correspondingly the Lagrangians

$\mathcal{L}_1(v, \mu) = J(v) + \langle \mu, v \rangle_{H^{-1}(\Omega) \times H^1(\Omega)}$,
$\mathcal{L}_2(v, \mu) = J(v) + \langle \mu, \gamma_0 v \rangle_{H^{-1/2}(\Gamma) \times H^{1/2}(\Gamma)}$ and
$\mathcal{L}_3(v, \mu) = J(v) + \langle \mu, v \rangle_{L^\infty(\Omega) \times L^1(\Omega)}$.

We leave other details of the proof to the reader except to remark that,
$\Lambda_i$ being cones in infinite dimensional Banach spaces, the sets $\Lambda_{i, \ell}$
$(i = 1, 2, 3)$ for any $\ell > 0$ will only be convex sets which are compact in the
weak topologies of $H^{-1}(\Omega)$ and $H^{-1/2}(\Gamma)$ for $i = 1, 2$ and in the weak *
topology of $L^\infty(\Omega)$ for $i = 3$.

3.2 Dual Problem

We once again restrict ourselves to the problem considered in 3.1, i.e. $J$
is a quadratic form on $V = H^1(\Omega)$ given by (3.1) and the closed convex
set $K$ is given by (3.2). We shall study the dual problem in this case. We
take $\Lambda$ and $\Phi$ as before.

We recall that the dual problem is the following:

To find $(u, \lambda) \in V \times \Lambda$ such that

$\mathcal{L}(u, \lambda) = \sup_{\mu \ge 0} \inf_{v \in V} \mathcal{L}(v, \mu)$
$\qquad = \sup_{\mu \ge 0} \inf_{v \in V} \{ \frac{1}{2} a(v, v) - L(v) + \frac{1}{2} \mu (\|v\|^2 - 1) \}$.

We fix a $\mu \ge 0$.

First of all we consider the minimization problem without constraints
for the functional

$v \mapsto \mathcal{L}(v, \mu) = \frac{1}{2} a(v, v) - L(v) + \frac{1}{2} \mu (\|v\|^2 - 1)$

on the space $V = H^1(\Omega)$. We know from Chapter 2 (Theorem 2.1) that
it has a unique minimum $u_\mu \in V$ since $\mathcal{L}(\cdot, \mu)$ has a gradient and a hessian
(which is coercive) everywhere. Moreover, $(\operatorname{grad} \mathcal{L}(\cdot, \mu))(u_\mu) = 0$, i.e.
we have

(3.19)  $a(u_\mu, \varphi) - L(\varphi) + \mu (u_\mu, \varphi) = 0, \quad \forall \varphi \in V$.

We can write using the Fréchet-Riesz theorem

$a(u, \varphi) = ((Au, \varphi))$, $L(\varphi) = ((F, \varphi))$, $(u, \varphi) = ((Bu, \varphi))$

where $((\cdot, \cdot))$ denotes the inner product in $H^1(\Omega)$ and $Au, F, Bu \in H^1(\Omega)$.
Then (3.19) can be rewritten as

(3.20)  $A u_\mu - F + \mu B u_\mu = 0$.

Hence the unique solution $u_\mu \in V$ of the minimizing problem without
constraints for $\mathcal{L}(\cdot, \mu)$ is given by

(3.21)  $u_\mu = (A + \mu B)^{-1} F$.

We can now formulate our next result as follows.

Theorem 3.2. Under the assumptions of Theorem 3.1 the dual of the
primal Problem 3.1 is the following:

To find $\lambda \in \Lambda$ such that $J^*(\lambda) = \inf_{\mu \in \Lambda} J^*(\mu)$, where

(3.22)  $J^*(\mu) = ((F, u_\mu)) + \mu$,

i.e.

Dual Problem (3.2). To find $\lambda \ge 0$ such that $J^*(\lambda) = \inf_{\mu \ge 0} J^*(\mu)$.

Proof. Consider

$\mathcal{L}(u_\mu, \mu) = \frac{1}{2} ((A u_\mu, u_\mu)) - ((F, u_\mu)) + \frac{1}{2} \mu (\|u_\mu\|^2 - 1)$
$\qquad = \frac{1}{2} ((A u_\mu, u_\mu)) - ((F, u_\mu)) + \frac{1}{2} \mu (((B u_\mu, u_\mu)) - 1)$
$\qquad = \frac{1}{2} (((A + \mu B) u_\mu, u_\mu)) - ((F, u_\mu)) - \mu / 2$.

Now using (3.20) we can write

$\mathcal{L}(u_\mu, \mu) = -\frac{1}{2} ((F, u_\mu)) - \mu / 2 = -\frac{1}{2} \{ ((F, u_\mu)) + \mu \}$.

Thus we see that

$\sup_{\mu \ge 0} \inf_{v \in V} \mathcal{L}(v, \mu) = \sup_{\mu \ge 0} \left[ -\frac{1}{2} \{ ((F, u_\mu)) + \mu \} \right] = -\frac{1}{2} \inf_{\mu \ge 0} \{ ((F, u_\mu)) + \mu \}$

which proves the assertion. □

We wish to construct an algorithm for the solution of the dual
Problem (3.2). We observe that in this case the constraint set
$\Lambda = \{\mu \mid \mu \in \mathbb{R},\ \mu \ge 0\}$ is a cone with vertex at $0 \in \mathbb{R}$ and that numerically it is easy to compute
the projection on a cone. In fact, in our special case we have

$P_\Lambda(\mu) = \mu$ if $\mu \ge 0$, and $P_\Lambda(\mu) = 0$ otherwise.

Hence we can use the algorithm given by the method of gradient
with projection. This we shall discuss a little later. We shall need, for
this method, to calculate the gradient of the cost function $J^*$ for the dual
problem.

From (3.22) we have

$J^*(\mu) = ((F, u_\mu)) + \mu$.

Taking G-derivatives on both sides we get

(3.23)  $(\operatorname{grad} J^*)(\mu) = J^{*\prime}(\mu) = ((F, u'_\mu)) + 1$

where $u'_\mu$ is the derivative of $u_\mu$ with respect to $\mu$. In order to compute
$u'_\mu$ we differentiate the equation (3.20) with respect to $\mu$ to get

$A u'_\mu + \mu B u'_\mu + B u_\mu = 0$

and so

(3.24)  $u'_\mu = -(A + \mu B)^{-1} B u_\mu$.

Substituting (3.24) in (3.23) we see that

$J^{*\prime}(\mu) = -((F, (A + \mu B)^{-1} B u_\mu)) + 1$.

Since $a(\cdot, \cdot)$ is symmetric $A$ is self adjoint and since $(\cdot, \cdot)$ is symmetric
$B$ is also self adjoint. Then $(A + \mu B)^{-1}$ is also self adjoint. This fact
together with (3.21) will imply

$J^{*\prime}(\mu) = -(((A + \mu B)^{-1} F, B u_\mu)) + 1 = -((u_\mu, B u_\mu)) + 1$.

This is nothing but saying

(3.25)  $J^{*\prime}(\mu) = 1 - \|u_\mu\|^2$.
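
The formula (3.25) can be checked numerically on a finite dimensional stand-in. In the sketch below, which is only an illustration under stated assumptions, $V$ is replaced by $\mathbb{R}^2$ with the Euclidean inner product playing the role of $((\cdot, \cdot))$. The matrices `A`, `B` and the vector `F` are hypothetical data (both matrices symmetric positive definite), so that $(u, \varphi) = ((Bu, \varphi))$ becomes $\varphi^T B u$. The code compares a central finite difference of $J^*(\mu) = ((F, u_\mu)) + \mu$ with the closed form $1 - \|u_\mu\|^2$.

```python
# Hypothetical 2x2 data standing in for the operators A, B and the element F.
A = ((2.0, 1.0), (1.0, 3.0))   # symmetric positive definite
B = ((1.0, 0.0), (0.0, 0.5))   # symmetric positive definite
F = (4.0, 3.0)


def solve_state(mu):
    """Return u_mu = (A + mu*B)^{-1} F, using Cramer's rule for the 2x2 system."""
    m00, m01 = A[0][0] + mu * B[0][0], A[0][1] + mu * B[0][1]
    m10, m11 = A[1][0] + mu * B[1][0], A[1][1] + mu * B[1][1]
    det = m00 * m11 - m01 * m10
    return ((m11 * F[0] - m01 * F[1]) / det,
            (m00 * F[1] - m10 * F[0]) / det)


def b_form(u, v):
    """Analogue of the L^2 inner product: (u, v) = ((Bu, v))."""
    return sum(B[i][j] * u[j] * v[i] for i in range(2) for j in range(2))


def dual_cost(mu):
    """J*(mu) = ((F, u_mu)) + mu, as in (3.22)."""
    u = solve_state(mu)
    return F[0] * u[0] + F[1] * u[1] + mu


def dual_gradient(mu):
    """Closed form (3.25): J*'(mu) = 1 - ||u_mu||^2."""
    u = solve_state(mu)
    return 1.0 - b_form(u, u)


def finite_difference(mu, h=1e-5):
    """Central difference approximation of J*'(mu)."""
    return (dual_cost(mu + h) - dual_cost(mu - h)) / (2.0 * h)


for mu in (0.0, 1.0, 2.5):
    print(mu, dual_gradient(mu), finite_difference(mu))
```

The two printed columns should agree up to the discretization error of the finite difference, which is a quick sanity check of the derivation leading from (3.23) to (3.25).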

Remark 3.2. In our discussion above the functional $\Phi$ is defined by (3.4)
and we found that the gradient of the dual cost function is given by (3.25).
More generally, if $\Phi(v, \mu) = (g(v), \mu)$ then the gradient of the dual cost
function can be shown to be $J^{*\prime}(\mu) = -g(u_\mu)$. We leave the straightforward
verification of this fact to the reader.

3.3 Method of Uzawa

The method of Uzawa that we shall study in this section gives an algorithm
to construct a minimizing sequence for the dual problem and also
an algorithm for the primal problem itself (see [6], [49]). The important
idea used is that since the dual problem is one of minimization over a
cone in a suitable space it is easy to compute the projection numerically
onto such a cone. The algorithm we give is nothing but the method of
gradient with projection for the dual problem, discussed earlier.
We shall show that this method provides strong convergence of the
minimizing sequence obtained for the primal problem while we have
only a very weak result on the convergence of the algorithm for the dual
problem.

In general the algorithm for the dual problem may not converge.
The interest of the method is mainly the convergence of the minimizing
sequence for the primal problem.

We shall once again restrict ourselves only to the situation considered
earlier, i.e. $J$, $K$, $\Lambda$, $\Phi$ and $\mathcal{L}$ are defined by (3.1) - (3.5) respectively.

Algorithm. Let $\lambda_0 \in \Lambda$ be an arbitrarily fixed point and suppose $\lambda_m$ is
determined.

We define $\lambda_{m+1}$ by

(3.26)  $\lambda_{m+1} = P_\Lambda(\lambda_m - \rho J^{*\prime}(\lambda_m))$

where $P_\Lambda$ denotes the projection onto the cone $\Lambda$ and $\rho > 0$.
In our special case we get, using (3.25),

(3.26)'  $\lambda_{m+1} = P_\Lambda(\lambda_m - \rho (1 - \|u_m\|^2))$

where $u_m = u_{\lambda_m}$ is the unique solution of the problem

(3.20)'  $A u_m + \lambda_m B u_m = F$,

i.e.

(3.21)'  $u_m = (A + \lambda_m B)^{-1} F$.

We remark that (3.21)' is equivalent to solving a Neumann problem.
In the special case where $a(v, v) = |||v|||^2$ we have to solve the Neumann
problem

(3.20)''  $-\Delta u_m + (1 + \lambda_m) u_m = f$ in $\Omega$,
          $\partial u_m / \partial n = 0$ on $\Gamma$,

i.e. at each stage of the iteration we need to solve a Neumann problem
in order to determine the next iterate $\lambda_{m+1}$.
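
The iteration (3.26)', (3.21)' can be sketched in a few lines of Python on a finite dimensional stand-in. This is a minimal illustration under stated assumptions, not the discretization of an actual Neumann problem: $V$ is replaced by $\mathbb{R}^2$ with the Euclidean inner product, and the matrices `A`, `B` and the vector `F` are hypothetical data with `A` and `B` symmetric positive definite and the eigenvalues of `B` at most 1, so that $\|u\| \le |||u|||$ as in the text. The step `rho` is taken below $\alpha^3 / N^2$, where $\alpha$ is the smallest eigenvalue of `A` and $N = |F|$.

```python
import math

# Hypothetical 2x2 data standing in for the operators A, B and the element F.
A = ((2.0, 1.0), (1.0, 3.0))   # symmetric positive definite
B = ((1.0, 0.0), (0.0, 0.5))   # symmetric positive definite, eigenvalues <= 1
F = (4.0, 3.0)


def solve_state(lam):
    """Solve (A + lam*B) u = F, the analogue of (3.20)', by Cramer's rule."""
    m00, m01 = A[0][0] + lam * B[0][0], A[0][1] + lam * B[0][1]
    m10, m11 = A[1][0] + lam * B[1][0], A[1][1] + lam * B[1][1]
    det = m00 * m11 - m01 * m10
    return ((m11 * F[0] - m01 * F[1]) / det,
            (m00 * F[1] - m10 * F[0]) / det)


def b_form(u, v):
    """Analogue of the L^2 inner product: (u, v) = ((Bu, v))."""
    return sum(B[i][j] * u[j] * v[i] for i in range(2) for j in range(2))


def project_on_cone(mu):
    """Projection P_Lambda onto the cone Lambda = [0, +infinity)."""
    return mu if mu >= 0.0 else 0.0


def uzawa(lam0=0.0, rho=0.1, iterations=2000):
    """Run the iteration (3.26)' and return the final pair (u_m, lambda_m)."""
    lam = lam0
    for _ in range(iterations):
        u = solve_state(lam)                          # state equation (3.21)'
        gradient = 1.0 - b_form(u, u)                 # dual gradient (3.25)
        lam = project_on_cone(lam - rho * gradient)   # update (3.26)'
    return solve_state(lam), lam


# Admissible step range: alpha is the smallest eigenvalue of A, N = |F|.
trace = A[0][0] + A[1][1]
det_a = A[0][0] * A[1][1] - A[0][1] * A[1][0]
alpha = (trace - math.sqrt(trace ** 2 - 4.0 * det_a)) / 2.0
rho_max = alpha ** 3 / (F[0] ** 2 + F[1] ** 2)

u, lam = uzawa(rho=0.1)
print("rho_max =", rho_max)
print("lambda =", lam, "u =", u, "||u||^2 =", b_form(u, u))
```

For this data the unconstrained minimizer lies outside $K$, so the limit has $\lambda > 0$ and $\|u\|^2 = 1$, in agreement with the complementarity condition (3.7). Each pass through the loop costs one linear solve, which corresponds to one Neumann problem in the setting of the text.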

We shall prove the following main result of this section.

Theorem 3.3. Suppose the hypotheses of Theorem 3.1 are satisfied.
Then we have the following assertions.

(a) The sequence $u_m = u_{\lambda_m}$ determined by (3.20)' converges strongly
to the (unique) solution of the primal Problem 3.1.

(b) Any cluster point of the sequence $\lambda_m$ determined by (3.26)' is a
solution of the dual Problem 3.2.

The proof of the theorem is in several steps. For this we shall need a
Taylor's formula for the dual cost function $J^*$ (i.e. the functional (3.22))
and an inequality which is a consequence of Taylor's formula.

Taylor's formula for $J^*$. Let $\lambda, \mu \in \Lambda$ and consider the problems

(3.27)  $(A + \lambda B) u = F$ and $(A + \mu B) v = F$

where we have written $u_\mu = v$ and $u_\lambda = u$. We can also write the second
equation as

$(A + \lambda B) v = F - (\mu - \lambda) B v = (A + \lambda B) u - (\mu - \lambda) B v$

i.e.

$(A + \lambda B)(v - u) = -(\mu - \lambda) B v$.

Similarly we have

$(A + \mu B)(v - u) = -(\mu - \lambda) B u$

which implies that

(3.28)  $u_\mu - u_\lambda = v - u = -(\mu - \lambda)(A + \mu B)^{-1} B u_\lambda$.

Then (3.22) together with (3.28) gives

$J^*(\mu) = J^*(\lambda) + ((F, u_\mu - u_\lambda)) + (\mu - \lambda)$
$\qquad = J^*(\lambda) - (\mu - \lambda)((F, (A + \mu B)^{-1} B u_\lambda)) + \mu - \lambda$
$\qquad = J^*(\lambda) - (\mu - \lambda)(((A + \mu B)^{-1} F, B u_\lambda)) + \mu - \lambda$

since $(A + \mu B)^{-1}$ is self adjoint because $a(\cdot, \cdot)$ is symmetric and $(\cdot, \cdot)$ is
symmetric. Once again using the second equation in (3.27) we get

$J^*(\mu) = J^*(\lambda) - (\mu - \lambda)((u_\mu, B u_\lambda)) + (\mu - \lambda)$
$\qquad = J^*(\lambda) - (\mu - \lambda)(u_\lambda, u_\lambda) + (\mu - \lambda) - (\mu - \lambda)(u_\mu - u_\lambda, u_\lambda)$

where we have used $((\cdot, B \cdot)) = (\cdot, \cdot)$, i.e. we have

(3.29)  $J^*(\mu) = J^*(\lambda) + (\mu - \lambda)[1 - \|u_\lambda\|^2] - (\mu - \lambda)(u_\mu - u_\lambda, u_\lambda)$.

We shall now get an estimate for the last term of (3.29). From (3.28)
we can write

$(((A + \mu B)(v - u), v - u)) = -(\mu - \lambda)((B u, v - u))$

which is nothing but

$a(v - u, v - u) + \mu (v - u, v - u) = -(\mu - \lambda)(u, v - u)$.

Using coercivity of $a(\cdot, \cdot)$ and $\mu (v - u, v - u) \ge 0$ on the left side and the
Cauchy-Schwarz inequality on the right side we get

$\alpha |||v - u|||^2 \le |\mu - \lambda| \, |||u||| \, |||v - u|||$

i.e.

(3.30)  $|||v - u||| \le |\mu - \lambda| \, |||u||| / \alpha$.

On the other hand, since $u$ is a solution of (3.20), we also have

$a(u, u) + \lambda (u, u) = L(u)$

from which we get, again using coercivity on the left,

$\alpha |||u|||^2 \le \|L\|_{V'} |||u||| \le N |||u|||$, for some constant $N > 0$,

i.e.

$|||u||| \le N / \alpha$.

On substituting this in (3.30) we get the estimate

$|||v - u||| \le N |\mu - \lambda| / \alpha^2$

which is the same thing as

(3.31)  $|||u_\mu - u_\lambda||| \le N |\mu - \lambda| / \alpha^2$.

Finally (3.29) together with this estimate (3.31) implies

(3.32)  $J^*(\mu) \le J^*(\lambda) + (\mu - \lambda)(1 - \|u_\lambda\|^2) + N^2 |\mu - \lambda|^2 / \alpha^3$.
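
Both the Taylor identity (3.29) and the bound (3.32) lend themselves to a numerical check. The Python sketch below does this on a hypothetical finite dimensional model: $V$ is replaced by $\mathbb{R}^2$ with the Euclidean inner product, and `A`, `B`, `F` are assumed data (symmetric positive definite matrices, eigenvalues of `B` at most 1 so that $\|u\| \le |||u|||$). The coercivity constant $\alpha$ is taken as the smallest eigenvalue of `A` and $N$ as the Euclidean norm of `F`.

```python
import math

# Hypothetical 2x2 data standing in for the operators A, B and the element F.
A = ((2.0, 1.0), (1.0, 3.0))   # symmetric positive definite
B = ((1.0, 0.0), (0.0, 0.5))   # symmetric positive definite, eigenvalues <= 1
F = (4.0, 3.0)


def solve_state(mu):
    """Return u_mu = (A + mu*B)^{-1} F by Cramer's rule."""
    m00, m01 = A[0][0] + mu * B[0][0], A[0][1] + mu * B[0][1]
    m10, m11 = A[1][0] + mu * B[1][0], A[1][1] + mu * B[1][1]
    det = m00 * m11 - m01 * m10
    return ((m11 * F[0] - m01 * F[1]) / det,
            (m00 * F[1] - m10 * F[0]) / det)


def b_form(u, v):
    """Analogue of the L^2 inner product: (u, v) = ((Bu, v))."""
    return sum(B[i][j] * u[j] * v[i] for i in range(2) for j in range(2))


def dual_cost(mu):
    """J*(mu) = ((F, u_mu)) + mu, as in (3.22)."""
    u = solve_state(mu)
    return F[0] * u[0] + F[1] * u[1] + mu


# Constants of the estimate: alpha = smallest eigenvalue of A, N = |F|.
trace = A[0][0] + A[1][1]
det_a = A[0][0] * A[1][1] - A[0][1] * A[1][0]
ALPHA = (trace - math.sqrt(trace ** 2 - 4.0 * det_a)) / 2.0
N = math.hypot(F[0], F[1])
M = N ** 2 / ALPHA ** 3


def taylor_terms(lam, mu):
    """Return (gap, remainder, bound) for the pair (lam, mu).

    gap       = J*(mu) - [J*(lam) + (mu - lam)(1 - ||u_lam||^2)]
    remainder = -(mu - lam) (u_mu - u_lam, u_lam), last term of (3.29)
    bound     = M |mu - lam|^2, last term of (3.32)
    """
    u_lam, u_mu = solve_state(lam), solve_state(mu)
    first_order = dual_cost(lam) + (mu - lam) * (1.0 - b_form(u_lam, u_lam))
    diff = (u_mu[0] - u_lam[0], u_mu[1] - u_lam[1])
    remainder = -(mu - lam) * b_form(diff, u_lam)
    return dual_cost(mu) - first_order, remainder, M * (mu - lam) ** 2


points = [0.5 * i for i in range(11)]   # sample multipliers in [0, 5]
results = [taylor_terms(lam, mu) for lam in points for mu in points]
worst_identity_error = max(abs(gap - rem) for gap, rem, _ in results)
bound_holds = all(gap <= bound + 1e-12 for gap, _, bound in results)
print("largest deviation from (3.29):", worst_identity_error)
print("inequality (3.32) holds on the samples:", bound_holds)
```

On this example the remainder is much smaller than $M |\mu - \lambda|^2$, which suggests that the step restriction derived from (3.32) is conservative in practice. This observation concerns the sample data only.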

Proof of Theorem 3.3.

Step 1. $J^*(\lambda_m)$ is a decreasing sequence and is bounded below if the
parameter $\rho > 0$ is sufficiently small. We recall that $\lambda_{m+1}$ is defined as

$\lambda_{m+1} = P_\Lambda(\lambda_m - \rho (1 - \|u_m\|^2))$.

We know that in the Hilbert space $\mathbb{R}$ the projection $P_\Lambda$ onto the closed
convex set $\Lambda$ is characterized by the variational inequality

$(\lambda_m - \rho (1 - \|u_m\|^2) - \lambda_{m+1}, \mu - \lambda_{m+1})_{\mathbb{R}} \le 0, \quad \forall \mu \in \Lambda$,

i.e. we have

(3.33)  $(\lambda_m - \rho (1 - \|u_m\|^2) - \lambda_{m+1})(\mu - \lambda_{m+1}) \le 0, \quad \forall \mu \in \Lambda$.

Putting $\mu = \lambda_m$ in this variational inequality we find

(3.34)  $|\lambda_m - \lambda_{m+1}|^2 \le \rho (1 - \|u_m\|^2)(\lambda_m - \lambda_{m+1})$.

On the other hand (3.32) with $\mu = \lambda_{m+1}$, $\lambda = \lambda_m$, $u_\lambda = u_m\ (= u_{\lambda_m})$
becomes

$J^*(\lambda_{m+1}) \le J^*(\lambda_m) + (\lambda_{m+1} - \lambda_m)(1 - \|u_m\|^2) + M |\lambda_{m+1} - \lambda_m|^2$

where $M$ is the constant $N^2 / \alpha^3 > 0$. If we use (3.34) on the right side of
this inequality we get

$J^*(\lambda_{m+1}) \le J^*(\lambda_m) - \frac{1}{\rho} |\lambda_{m+1} - \lambda_m|^2 + M |\lambda_{m+1} - \lambda_m|^2$

i.e.

(3.35)  $J^*(\lambda_{m+1}) + (1/\rho - M) |\lambda_{m+1} - \lambda_m|^2 \le J^*(\lambda_m)$.

Here, $1/\rho - M$ would be $> 0$ if we take $0 < \rho < 1/M = \alpha^3 / N^2$,
a fixed constant independent of $\lambda$. We therefore take $\rho \in\, ]0, 1/M[$ in the
definition of $\lambda_{m+1}$ so that we have

$J^*(\lambda_{m+1}) \le J^*(\lambda_m)$,

which proves that the sequence $J^*(\lambda_m)$ is decreasing for $0 < \rho < 1/M$.
To prove that it is bounded below we use the definition of $J^*(\lambda)$ and the
Cauchy-Schwarz inequality: from (3.22),

$J^*(\lambda) = ((F, u_\lambda)) + \lambda \ge -|||F||| \, |||u_\lambda||| \ge -\frac{N}{\alpha} |||F|||$

since $|||u_\lambda||| \le N / \alpha$. This proves that $J^*(\lambda_m)$ is bounded below by
$-\frac{N}{\alpha} |||F|||$, a known constant.
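
The descent property (3.35) can be observed numerically. The following Python sketch, again on a hypothetical model in $\mathbb{R}^2$ (the data `A`, `B`, `F`, the starting value and the iteration count are assumptions made for illustration), records $J^*(\lambda_m)$ and $|\lambda_{m+1} - \lambda_m|$ along the iteration with a step $\rho < 1/M = \alpha^3 / N^2$.

```python
import math

# Hypothetical 2x2 data standing in for the operators A, B and the element F.
A = ((2.0, 1.0), (1.0, 3.0))   # symmetric positive definite
B = ((1.0, 0.0), (0.0, 0.5))   # symmetric positive definite, eigenvalues <= 1
F = (4.0, 3.0)


def solve_state(lam):
    """Return u_lam = (A + lam*B)^{-1} F by Cramer's rule."""
    m00, m01 = A[0][0] + lam * B[0][0], A[0][1] + lam * B[0][1]
    m10, m11 = A[1][0] + lam * B[1][0], A[1][1] + lam * B[1][1]
    det = m00 * m11 - m01 * m10
    return ((m11 * F[0] - m01 * F[1]) / det,
            (m00 * F[1] - m10 * F[0]) / det)


def b_form(u, v):
    """Analogue of the L^2 inner product: (u, v) = ((Bu, v))."""
    return sum(B[i][j] * u[j] * v[i] for i in range(2) for j in range(2))


def dual_cost(lam):
    """J*(lam) = ((F, u_lam)) + lam, as in (3.22)."""
    u = solve_state(lam)
    return F[0] * u[0] + F[1] * u[1] + lam


def uzawa_history(lam0, rho, iterations):
    """Return the lists of J*(lambda_m) and of |lambda_{m+1} - lambda_m|."""
    lam = lam0
    costs, steps = [dual_cost(lam)], []
    for _ in range(iterations):
        u = solve_state(lam)
        new_lam = max(0.0, lam - rho * (1.0 - b_form(u, u)))   # (3.26)'
        steps.append(abs(new_lam - lam))
        lam = new_lam
        costs.append(dual_cost(lam))
    return costs, steps


# Constants of Step 1: alpha = smallest eigenvalue of A, N = |F|, M = N^2/alpha^3.
trace = A[0][0] + A[1][1]
det_a = A[0][0] * A[1][1] - A[0][1] * A[1][0]
alpha = (trace - math.sqrt(trace ** 2 - 4.0 * det_a)) / 2.0
N = math.hypot(F[0], F[1])
M = N ** 2 / alpha ** 3
lower_bound = -(N / alpha) * N   # -(N / alpha) |F|, with |F| = N in this model

rho = 0.1                        # satisfies rho < 1/M for this data
costs, steps = uzawa_history(lam0=5.0, rho=rho, iterations=2000)
print("1/M =", 1.0 / M)
print("first costs:", costs[:3], "last cost:", costs[-1])
print("last step size:", steps[-1])
```

With this data the recorded costs are nonincreasing and stay above the lower bound of Step 1, while the step sizes tend to zero. The bound $\rho < 1/M$ is a sufficient condition coming from (3.35); the sketch does not explore what happens for larger steps.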

Step 2. By Step 1 it follows that $J^*(\lambda_m)$ converges to a limit as $m \to +\infty$.
Moreover, (3.35) will then imply that

(3.36)  $|\lambda_{m+1} - \lambda_m|^2 \to 0$ as $m \to +\infty$.

Step 3. The sequence $\lambda_m$ has a cluster point in $\mathbb{R}$. For this, since $J^*(\lambda_m)$
is decreasing we have $J^*(\lambda_{m+1}) \le J^*(\lambda_0)$, i.e. we have

$((F, u_{m+1})) + \lambda_{m+1} \le ((F, u_0)) + \lambda_0$

and the right hand side is a constant independent of $m$. So, by the
Cauchy-Schwarz inequality,

$\lambda_{m+1} \le ((F, u_0 - u_{m+1})) + \lambda_0 \le ((F, u_0)) + \lambda_0 + |||u_{m+1}||| \, |||F|||$.

But $|||u_{m+1}|||$ is bounded by a constant $(= N / \alpha)$ and hence

$0 \le \lambda_{m+1} \le ((F, u_0)) + \lambda_0 + N |||F||| / \alpha$,

i.e. the sequence $\lambda_m$ is bounded. We can then extract a subsequence
which converges.

Similarly, since $u_m$ is a bounded sequence in $H^1(\Omega)$ there exists a
subsequence which converges weakly in $H^1(\Omega)$. Let $\{m'\}$ be a subsequence
of the positive integers such that

$\lambda_{m'} \to \lambda^*$ in $\mathbb{R}$ and $u_{m'} = u_{\lambda_{m'}} \rightharpoonup u^*$ in $H^1(\Omega)$.

Step 4. Any cluster point $\lambda^*$ of the sequence $\lambda_m$ is a solution of the dual
Problem 3.2.

Let $\lambda_{m'}$ be a subsequence which converges to $\lambda^*$. We may assume,
if necessary by extracting a subsequence, that $u_{m'} \rightharpoonup u^*$ in $H^1(\Omega)$. By
Rellich's lemma the inclusion of $H^1(\Omega)$ in $L^2(\Omega)$ is compact (since $\Omega$ is
bounded) and hence $u_{m'} \to u^*$ in $L^2(\Omega)$. Then $u^*$ satisfies the equation

(3.37)  $u^* \in H^1(\Omega)$, $A u^* + \lambda^* B u^* = F$.

To see this, since $u_{m'}$ is a solution of (3.20)' we have

$((A u_{m'}, \varphi)) + \lambda_{m'} ((B u_{m'}, \varphi)) = ((F, \varphi)), \quad \forall \varphi \in H^1(\Omega)$,
i.e. $((A u_{m'}, \varphi)) + \lambda_{m'} (u_{m'}, \varphi) = ((F, \varphi)), \quad \forall \varphi \in H^1(\Omega)$.

Taking limits as $m' \to +\infty$ we have

$((A u^*, \varphi)) + \lambda^* (u^*, \varphi) = ((F, \varphi)), \quad \forall \varphi \in H^1(\Omega)$

which is the same thing as (3.37).

On the other hand, (3.33) for the subsequence becomes

$\frac{1}{\rho} (\lambda_{m'} - \lambda_{m'+1})(\mu - \lambda_{m'+1}) \le (1 - \|u_{m'}\|^2)(\mu - \lambda_{m'+1}), \quad \forall \mu \in \Lambda$.

Here on the left side $\mu - \lambda_{m'+1}$ is bounded independent of $m'$ and
$\lambda_{m'} - \lambda_{m'+1} \to 0$ as $m' \to +\infty$ by (3.36). On the right side, again by
(3.36), $\mu - \lambda_{m'+1} \to \mu - \lambda^*$ and $(1 - \|u_{m'}\|^2) \to (1 - \|u^*\|^2)$ as $m' \to +\infty$.
Thus we get on passing to the limits

(3.38)  $\lambda^* \in \Lambda$, $(1 - \|u^*\|^2)(\mu - \lambda^*) \ge 0$, $\forall \mu \in \Lambda$.

Since $u^*$ is a solution of (3.37), we know on using (3.25) that

$(\operatorname{grad} J^*)(\lambda^*) = J^{*\prime}(\lambda^*) = 1 - \|u^*\|^2$.

Then (3.38) is the same thing as

$\lambda^* \in \Lambda$, $J^{*\prime}(\lambda^*)(\mu - \lambda^*) \ge 0$, $\forall \mu \in \Lambda$.

By the results of Chapter 2 (Theorem 2.2) this last variational
inequality characterizes a solution of the dual Problem 3.2. Thus $\lambda^*$ is
a solution of the dual problem.

Step 10. The sequence u m converges weakly in H l {Q) to the unique 
solution u of the primal problem. 

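As in the earlier steps, since the sequence $u_m$ is bounded in $H^1(\Omega)$
and $\lambda_m$ is bounded in $\mathbb{R}$, we can find a subsequence $m'$ of integers such
that

$u_{m'} \rightharpoonup u^*$ in $H^1(\Omega)$ and $\lambda_{m'} \to \lambda^*$ in $\mathbb{R}$.

We shall prove that $(u^*, \lambda^*)$ is a saddle point for the Lagrangian. It
is easily verified that

$(\operatorname{grad}_v \mathscr{L}(\cdot, \lambda^*))(u^*) \cdot \varphi = a(u^*, \varphi) + \lambda^*(u^*, \varphi) - L(\varphi), \quad \forall \varphi \in H^1(\Omega).$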

But the right hand side vanishes because $u^*$ is the solution of the
equation

$Au^* + \lambda^* Bu^* = F$
as can be proved exactly as in Step 4. Moreover $\mathscr{L}(\cdot, \lambda^*)$ is convex
(strongly convex). Hence by Theorem 2.2 of Chapter 2

(3.39) $\mathscr{L}(u^*, \lambda^*) \le \mathscr{L}(v, \lambda^*), \quad \forall v \in H^1(\Omega).$
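Next we see similarly that

$(\operatorname{grad}_\mu \mathscr{L}(u^*, \cdot))(\lambda^*) = \tfrac12(\|u^*\|^2 - 1)$

and $\mathscr{L}(u^*, \cdot)$ is concave. Once again using (3.38) and Theorem 2.2 of
Chapter 2 we conclude that

(3.40) $\mathscr{L}(u^*, \mu) \le \mathscr{L}(u^*, \lambda^*), \quad \forall \mu \in \Lambda.$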

The two inequalities (3.39) and (3.40) together mean that $(u^*, \lambda^*)$ is
a saddle point for $\mathscr{L}$. Hence $u^*$ is a solution of the primal problem and
$\lambda^*$ is a solution of the dual problem. But since $J$ is strictly convex it has
a unique minimum in $H^1(\Omega)$. Hence $u = u^*$ and $u$ is the unique weak
cluster point of the sequence $u_m$ in $H^1(\Omega)$. This implies that the entire
sequence $u_m$ converges weakly to $u$ in $H^1(\Omega)$.

Step 11. The sequence $u_m$ converges strongly in $H^1(\Omega)$ to the unique
solution of the primal problem.

We can write, using the definition of the functional $J$:

$J(u) = J(u_m) + a(u_m, u - u_m) - L(u - u_m) + \tfrac12 a(u - u_m, u - u_m).$
By the coercivity of $a(\cdot, \cdot)$ applied to the last term on the right side,

$J(u_m) + \tfrac{\alpha}{2}|||u - u_m|||^2 \le J(u) - \{a(u_m, u - u_m) - L(u - u_m)\}$
$\qquad = J(u) - ((Au_m - F, u - u_m))$
$\qquad = J(u) + \lambda_m((Bu_m, u - u_m))$

since $u_m$ satisfies the equation (3.20)', i.e. we have

$J(u_m) + \tfrac{\alpha}{2}|||u - u_m|||^2 \le J(u) + \lambda_m(u_m, u - u_m).$
On the left hand side we know that $J(u_m) \to J(u)$, and on the right
hand side we know that $|\lambda_m|$ and $u_m$ are bounded while, by Step 5,
$u - u_m \rightharpoonup 0$ (weakly) in $H^1(\Omega)$.

Hence taking limits as $m \to +\infty$ we see that

$|||u - u_m||| \to 0$ as $m \to +\infty$.
This completely proves the theorem.

In conclusion we make some remarks on the method of Uzawa. 

Remark 3.3. In the example we have considered to describe the method
of Uzawa, $\Lambda$ is a cone in $\mathbb{R}$. But, in general, the cone $\Lambda$ will be a subset of
an infinite dimensional (Banach) space. We can still use our argument
of Step 3 of the proof to show that $\lambda_m$ has a weak cluster point and that
of Step 4 to show that a weak cluster point gives a solution of the dual
problem.

Remark 3.4. We can also use the method of Frank and Wolfe since also 
in this case the dual problem is one of minimization on a cone on which 
it is easy to compute projections numerically. 

Remark 3.5. While the method of Uzawa gives strong convergence
results for the algorithm for the primal problem, the result for the dual
problem is very weak.

Remark 3.7. Suppose we consider a more general type of primal
problem for the same functional $J$ defined by (3.1), of the form:

to find $u \in K$, $J(u) = \inf_{v \in K} J(v)$

where $K$ is a closed convex set in $V = H^1(\Omega)$ defined by

$K = \{v \mid v \in H^1(\Omega),\ g(v) \le 0\}$

with $g$ a mapping of $H^1(\Omega)$ into a suitable topological vector space $E$ (in
fact a Banach space) in which we have a notion of positivity. Then we
take a cone $\Lambda$ in $E'$ as in the earlier Remark and
$\Phi(v, \mu) = \langle \mu, g(v) \rangle_{E' \times E}$. In
order to carry over the same kind of algorithm as we have given above
in the special case we proceed as follows: Suppose $\lambda_m$ is determined
starting from a $\lambda_0 \in \Lambda$. We first solve the minimization problem

to find $u_m$ such that $\mathscr{L}(u_m, \lambda_m) = \inf_{v \in V} \mathscr{L}(v, \lambda_m)$,

for which

$\operatorname{grad} J^*(\lambda_m) = -g(u_m).$

Then we can use the earlier Remark to determine $\lambda_{m+1}$:

$\lambda_{m+1} = P_\Lambda(\lambda_m - \rho J^{*\prime}(\lambda_m)) = P_\Lambda(\lambda_m + \rho\, g(u_m)).$

We can now check that the rest of our argument goes through easily
in this case also, except that we keep in view our earlier remarks about
taking weak topologies in $E'$. For instance, we can use this procedure in
the cases of the convex sets $K_1, K_2, K_3$ of the earlier Remark. We leave the
details of these to the reader.
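
As a rough illustration of the iteration in Remark 3.7, the following Python sketch treats a finite-dimensional analogue. The setting is our own assumption and not an example from the text: $V = \mathbb{R}^n$, $J(v) = \tfrac12|v|^2 - (F, v)$, $g(v) = \tfrac12(|v|^2 - 1)$ and $\Lambda = [0, +\infty)$, so that the inner minimization has the closed form $u_m = F/(1 + \lambda_m)$. The name `uzawa_unit_ball` and the values of `rho` and `iterations` are hypothetical choices.

```python
def uzawa_unit_ball(f, rho=1.0, iterations=2000):
    """Uzawa iteration for: minimise 0.5*|v|^2 - (f, v) subject to 0.5*(|v|^2 - 1) <= 0.

    Assumed toy setting (not from the text): A = B = identity on R^n.
    """
    lam = 0.0          # lambda_0 in the cone Lambda = [0, +inf)
    u = list(f)
    for _ in range(iterations):
        # Minimise the Lagrangian in v for fixed lam: (1 + lam) * u = f.
        u = [fi / (1.0 + lam) for fi in f]
        # Constraint value g(u_m); recall grad J*(lam_m) = -g(u_m).
        g = 0.5 * (sum(ui * ui for ui in u) - 1.0)
        # Projection onto the cone: lam_{m+1} = P_Lambda(lam_m + rho * g(u_m)).
        lam = max(0.0, lam + rho * g)
    return u, lam


u, lam = uzawa_unit_ball([3.0, 4.0])
```

For $F = (3, 4)$ the constrained minimizer is the projection of $F$ onto the unit ball, $u = (0.6, 0.8)$, with multiplier $\lambda = |F| - 1 = 4$; the iteration above approaches these values.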



4 Minimization of Non-Differentiable Functionals 
Using Duality 

In this section we apply the duality method, using the theorem of Ky Fan
and Sion, to the case of a minimization problem for a functional which is
not G-differentiable. The main idea is to transform the minimization
problem into one of determining a saddle point for a suitable functional
on the product of the given space with a suitable cone. This functional
of two variables behaves very much like the Lagrangian (considered
earlier) for the regular part of the given functional. In fact we choose
the cone $\Lambda$ and the function $\Phi$ in such a way that the non-differentiable
part of the given functional can be written as $\sup_{\mu \in \Lambda} \Phi(v, \mu)$. It turns
out that in this case the dual cost function will be G-regular and hence
we can apply, for instance, the method of gradient with projection. This
in its turn enables us to give an algorithm to determine a minimizing
sequence for the original minimization problem. The proof of
convergence is on lines similar to the one we have given for the convergence
of the algorithm in the method of Uzawa.
We shall however begin our discussion assuming that we are given
the cone $\Lambda$ and the function $\Phi$ in a special form and thus we start in fact
with a saddle point problem.

Let $V$ and $E$ be two Hilbert spaces and let $J_0 : V \to \mathbb{R}$ be a functional
on $V$ of the form

(4.1) $V \ni v \mapsto J_0(v) = \tfrac12 a(v, v) - L(v) \in \mathbb{R}$
where as usual we assume:

(4.2) (i) $a(\cdot, \cdot)$ is a bilinear bicontinuous coercive form on $V$ and
      (ii) $L \in V'$.

Suppose we also have

(4.3) (iii) a closed convex bounded set $\Lambda$ in $E$ with $0 \in \Lambda$, and
      (iv) an operator $B \in \mathcal{L}(V, E)$.

We set

(4.4) $J_1(v) = \sup_{\mu \in \Lambda}(-(Bv, \mu)_E)$
and

(4.5) $J(v) = J_0(v) + J_1(v).$

Consider now the minimization problem: 

Primal Problem (4.6). To find $u \in V$ such that $J(u) = \inf_{v \in V} J(v)$.

We introduce the functional $\mathscr{L}$ on $V \times \Lambda$ by

(4.7) $\mathscr{L}(v, \mu) = J_0(v) - (Bv, \mu)_E.$

It is clear that if we define $\Phi(v, \mu) = -(Bv, \mu)_E$ then $\mathscr{L}$ can be
considered a Lagrangian associated to the functional $J_0$ and the cone
generated by $\Lambda$. Since $v \in V$, the condition that $\Phi(v, \mu) \le 0$ implies $v \in V$ is
automatically satisfied and moreover, we also have

$\Phi(v, \rho\mu) = -(Bv, \rho\mu)_E = -\rho(Bv, \mu)_E = \rho\,\Phi(v, \mu), \quad \forall \rho > 0.$
On the other hand we see that the minimax problem for the
functional $\mathscr{L}$ is nothing but our primal problem. In fact, we have

(4.8) $\inf_{v \in V} \sup_{\mu \in \Lambda} \mathscr{L}(v, \mu) = \inf_{v \in V}\bigl(J_0(v) + \sup_{\mu \in \Lambda}(-(Bv, \mu)_E)\bigr) = \inf_{v \in V} J(v).$

We are thus led to the problem of finding a saddle point for $\mathscr{L}$.

Remark 4.1. In practice, we are given $J_1$, the non-G-differentiable
part of the functional $J$ to be minimized, and hence it will be
necessary to choose the Hilbert space $E$, a closed convex bounded set $\Lambda$ in $E$
(with $0 \in \Lambda$) and an operator $B \in \mathcal{L}(V, E)$ suitably so that
$J_1(v) = \sup_{\mu \in \Lambda} -(Bv, \mu)_E = -\inf_{\mu \in \Lambda}(Bv, \mu)_E$.

We shall now examine a few examples of the functionals $J_1$ and the
corresponding $E$, $\Lambda$ and the operator $B$. In all the following examples we
take

$V = \mathbb{R}^n$, $E = \mathbb{R}^m$ and $B \in \mathcal{L}(V, E)$ an $(m \times n)$-matrix.

We also use the following standard norms in the Euclidean space
$\mathbb{R}^m$. If $1 \le p < +\infty$ then we define the norms:

$|\mu|_p = \bigl(\sum_{i=1}^{m} |\mu_i|^p\bigr)^{1/p}$

and

$|\mu|_\infty = \sup_{1 \le i \le m} |\mu_i|.$



Example 4.1. Let $\Lambda_1 = \{\mu \in \mathbb{R}^m : |\mu|_2 \le 1\}$. Then

$J_1(v) = \sup_{\mu \in \Lambda_1}(-(Bv, \mu)_E) = |Bv|_2.$

Example 4.2. Let $\Lambda_2 = \{\mu \in \mathbb{R}^m : |\mu|_1 \le 1\}$. Then $J_1(v) = |Bv|_\infty$. If we
denote the elements of the matrix $B$ by $b_{ij}$ then $b_i = (b_{i1}, \cdots, b_{in})$ is a
vector in $\mathbb{R}^n$ and $Bv = ((Bv)_1, \cdots, (Bv)_m)$:

$(Bv)_i = (b_i, v)_{\mathbb{R}^n} = \sum_{j=1}^{n} b_{ij} v_j.$

Hence

$J_1(v) = \max_{1 \le i \le m} |(Bv)_i| = \max_{1 \le i \le m} \bigl|\sum_{j=1}^{n} b_{ij} v_j\bigr|.$

Example 4.3. If we take $\Lambda_3 = \{\mu \in \mathbb{R}^m;\ |\mu|_\infty \le 1\}$ then we will find
$J_1(v) = |Bv|_1$ and hence

$J_1(v) = \sum_{i=1}^{m} \bigl|\sum_{j=1}^{n} b_{ij} v_j\bigr|.$

Example 4.4. If we take $\Lambda_4 = \{\mu \in \mathbb{R}^m;\ |\mu|_\infty \le 1,\ \mu \ge 0\}$ then we find
$J_1(v) = |(Bv)^+|_1$ where

$((Bv)^+)_i = (Bv)_i$ when $(Bv)_i \ge 0$, and $((Bv)^+)_i = 0$ when $(Bv)_i < 0$.

Hence

$J_1(v) = \sum_{i=1}^{m} \bigl(\sum_{j=1}^{n} b_{ij} v_j\bigr)^+.$
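
Examples 4.1 to 4.3 are instances of the duality between the norms $|\cdot|_2$, $|\cdot|_1$, $|\cdot|_\infty$ on $\mathbb{R}^m$. The following Python sketch, which is our own illustration and not part of the text, evaluates $-(Bv, \mu)_E$ at an explicit maximizer $\mu$ in each of the three balls for a sample vector `bv` standing for $Bv$. The helper names `neg_pairing` and `maximizer` are hypothetical.

```python
import math


def neg_pairing(bv, mu):
    """Value of -(Bv, mu)_E for vectors given as Python lists."""
    return -sum(x * y for x, y in zip(bv, mu))


def sign(x):
    return (x > 0) - (x < 0)


def maximizer(bv, ball):
    """A point mu of the chosen unit ball at which -(Bv, mu)_E is largest."""
    if ball == "l2":      # Example 4.1: mu = -Bv / |Bv|_2
        norm = math.sqrt(sum(x * x for x in bv))
        return [-x / norm for x in bv] if norm > 0 else [0.0] * len(bv)
    if ball == "l1":      # Example 4.2: all mass on a largest component
        k = max(range(len(bv)), key=lambda i: abs(bv[i]))
        return [-float(sign(bv[i])) if i == k else 0.0 for i in range(len(bv))]
    if ball == "linf":    # Example 4.3: mu_i = -sign((Bv)_i)
        return [-float(sign(x)) for x in bv]
    raise ValueError("unknown ball: " + ball)


bv = [3.0, -4.0, 1.0]                              # sample value of Bv
j1_l2 = neg_pairing(bv, maximizer(bv, "l2"))       # equals |Bv|_2
j1_l1 = neg_pairing(bv, maximizer(bv, "l1"))       # equals |Bv|_inf
j1_linf = neg_pairing(bv, maximizer(bv, "linf"))   # equals |Bv|_1
```

For this sample vector the three values are $\sqrt{26}$, $4$ and $8$, which are the Euclidean norm, the sup norm and the $\ell^1$ norm of $Bv$ respectively.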



Proposition 4.1. Under the assumptions made on $J_0$, $\Lambda$ and $B$ there
exists a saddle point for $\mathscr{L}$ in $V \times \Lambda$.

Proof. The mapping $v \mapsto \mathscr{L}(v, \mu)$ of $V \to \mathbb{R}$ is convex (in fact strictly
convex since $a(\cdot, \cdot)$ is coercive) and continuous and in particular lower
semi-continuous. The mapping $\Lambda \ni \mu \mapsto \mathscr{L}(v, \mu)$ is concave and
continuous and hence is upper semi-continuous. Let $\ell > 0$ be a constant which
we shall choose suitably later on and let us consider the set

$U_\ell = \{v \mid v \in V,\ \|v\|_V \le \ell\}.$
The set $U_\ell$ is a closed convex bounded set in $V$ and hence is weakly
compact. Similarly $\Lambda$ is also weakly compact in $E$. Thus taking weak topologies
on $V$ and $E$ we have two Hausdorff topological vector spaces. We can
now apply the theorem of Ky Fan and Sion to the sets $U_\ell$ and $\Lambda$. We see
that there exists a saddle point $(u_\ell, \lambda_\ell) \in U_\ell \times \Lambda$ for $\mathscr{L}$, i.e. we have

(4.9) $(u_\ell, \lambda_\ell) \in U_\ell \times \Lambda, \quad \mathscr{L}(u_\ell, \mu) \le \mathscr{L}(u_\ell, \lambda_\ell) \le \mathscr{L}(v, \lambda_\ell), \quad \forall (v, \mu) \in U_\ell \times \Lambda.$

Choosing $\mu = 0$ in the first inequality of (4.9) we get $0 \le -(Bu_\ell, \lambda_\ell)_E$,
i.e. $(Bu_\ell, \lambda_\ell)_E \le 0$, and

$J_0(u_\ell) \le J_0(u_\ell) - (Bu_\ell, \lambda_\ell)_E \le J_0(v) - (Bv, \lambda_\ell)_E.$

Next, if we take $v = 0 \in U_\ell$ we get

(4.10) $J_0(u_\ell) \le J_0(0)\ (= 0).$

From this we can show that $\|u_\ell\|_V$ is bounded. In fact, the inequality
(4.10) is nothing but

$\tfrac12 a(u_\ell, u_\ell) - L(u_\ell) \le 0.$
Using the coercivity of $a(\cdot, \cdot)$ (with the constant of coercivity $\alpha > 0$),

$\alpha\|u_\ell\|_V^2 \le a(u_\ell, u_\ell) \le 2L(u_\ell) \le 2\|L\|_{V'}\|u_\ell\|_V,$

(4.11) i.e. $\|u_\ell\|_V \le 2\|L\|_{V'}/\alpha.$

In other words, $\|u_\ell\|_V$ is bounded by a constant $c = 2\|L\|_{V'}/\alpha$
independent of $\ell$.

Now we take $\ell > c$. Then we can find a ball $B(u_\ell, r) = \{v \in V \mid \|v - u_\ell\|_V < r\}$
contained in the ball $B(0, \ell)$. It is enough to take $r \in\ ]0, (\ell - c)/2[$.
The functional $v \mapsto \mathscr{L}(v, \lambda_\ell)$ attains a local minimum at $u_\ell$ in such a ball.
Being (strictly) convex, it attains there its unique global minimum. Thus we have proved
that if we choose $\ell > c > 0$, where $c = 2\|L\|_{V'}/\alpha$, there exists

(4.12) $(u, \lambda) \in V \times \Lambda$ such that $\mathscr{L}(u, \mu) \le \mathscr{L}(u, \lambda) \le \mathscr{L}(v, \lambda), \quad \forall (v, \mu) \in V \times \Lambda,$
which means that $(u, \lambda)$ is a saddle point for $\mathscr{L}$ in $V \times \Lambda$.

Dual problem. By definition the dual problem is characterized by
considering the problem:

(4.13) to find $(u, \lambda) \in V \times \Lambda$ such that
       $\sup_{\mu \in \Lambda} \inf_{v \in V} \mathscr{L}(v, \mu) = \mathscr{L}(u, \lambda).$

We write $\mathscr{L}(v, \mu)$ in the following form: Since the mapping $v \mapsto a(u, v)$
is continuous linear there exists an element $Au \in V$ such that

$a(u, v) = (Au, v)_V, \quad \forall v \in V.$

Moreover, $A \in \mathcal{L}(V, V)$. Also by the Fréchet-Riesz theorem there exists
an $F \in V$ such that

$L(v) = (F, v)_V, \quad \forall v \in V.$

Thus we have

$\mathscr{L}(v, \mu) = \tfrac12(Av, v)_V - (F, v)_V - (Bv, \mu)_E$
$\qquad = \tfrac12(Av, v)_V - (v, F + B^*\mu)_V.$

For any $\mu \in \Lambda$ fixed we consider the minimization problem

(4.14) to find $u_\mu \in V$ such that $\mathscr{L}(u_\mu, \mu) = \inf_{v \in V} \mathscr{L}(v, \mu).$

Once again $v \mapsto \mathscr{L}(v, \mu)$ is twice G-differentiable and has a gradient
and a Hessian everywhere in $V$. In fact,

(4.15) $(\operatorname{grad}_v \mathscr{L}(\cdot, \mu))(v) \cdot \varphi = (Av, \varphi)_V - (F, \varphi)_V - (B^*\mu, \varphi)_V$
and

$(\operatorname{Hess}_v \mathscr{L}(\cdot, \mu))(\varphi, \psi) = (A\psi, \varphi)_V.$
Hence, the coercivity of $a(\cdot, \cdot)$ implies that

$(Av, v)_V = a(v, v) \ge \alpha\|v\|_V^2, \quad \forall v \in V,$
which then implies that $v \mapsto \mathscr{L}(v, \mu)$ is strictly convex. Then by
Theorem 2.2 of Chapter 2 there exists a unique solution $u_\mu$ of the problem (4.14)
and it satisfies the equation

$[\operatorname{grad}_v \mathscr{L}(\cdot, \mu)](u_\mu) = 0,$
i.e. there exists a unique $u_\mu \in V$ such that

$\mathscr{L}(u_\mu, \mu) = \inf_{v \in V} \mathscr{L}(v, \mu)$

and moreover $u_\mu$ satisfies the equation

(4.16) $(Au_\mu, \varphi)_V - (B^*\mu, \varphi)_V - (F, \varphi)_V = 0, \quad \forall \varphi \in V,$

i.e.

(4.16)' $Au_\mu = F + B^*\mu.$
Thus we have

(4.17) $u_\mu = A^{-1}(F + B^*\mu)$
and taking $\varphi = u_\mu$ in (4.16) we also find that

(4.18) $(Au_\mu, u_\mu)_V = (F + B^*\mu, u_\mu)_V.$
Using (4.17) and (4.18) we can write

$\mathscr{L}(u_\mu, \mu) = \tfrac12\{(Au_\mu, u_\mu)_V - 2(F, u_\mu)_V - 2(B^*\mu, u_\mu)_V\}$
$\qquad = -\tfrac12\{(F, A^{-1}(F + B^*\mu))_V + (B^*\mu, A^{-1}(F + B^*\mu))_V\}$
$\qquad = -\tfrac12\{(BA^{-1}B^*\mu, \mu)_E + 2(BA^{-1}F, \mu)_E + (F, A^{-1}F)_V\}$

since $A$ symmetric implies that $A^{-1}$ is also self adjoint. Thus we see that

$\sup_{\mu \in \Lambda} \inf_{v \in V} \mathscr{L}(v, \mu) = \sup_{\mu \in \Lambda} \mathscr{L}(u_\mu, \mu)$
$\qquad = \sup_{\mu \in \Lambda} -\tfrac12\{(BA^{-1}B^*\mu, \mu)_E + 2(BA^{-1}F, \mu)_E + (F, A^{-1}F)_V\}.$
If we set

(4.19) $\mathscr{A} = BA^{-1}B^*$ and $\mathscr{F} = -BA^{-1}F$
then $\mathscr{A} \in \mathcal{L}(E, E)$ and $\mathscr{F} \in E$ and moreover

(4.20) $\sup_{\mu \in \Lambda} \inf_{v \in V} \mathscr{L}(v, \mu) = -\tfrac12 \inf_{\mu \in \Lambda}\{(\mathscr{A}\mu, \mu)_E - 2(\mathscr{F}, \mu)_E + (F, A^{-1}F)_V\}.$

Here the functional

(4.21) $\mu \mapsto \tfrac12(\mathscr{A}\mu, \mu)_E - (\mathscr{F}, \mu)_E$

is quadratic on the convex set $\Lambda$. It is twice G-differentiable with respect
to $\mu$ in all directions in $E$ and has a gradient $G^*(\mu)$ and a Hessian $H^*(\mu)$
everywhere in $\Lambda$. In fact, we can easily see that

(4.22) $G^*(\mu) = \mathscr{A}\mu - \mathscr{F}.$

Thus we have proved the following

Proposition 4.2. Under the assumptions made on $J_0$, $\Lambda$ and $B$ the dual
of the primal problem (4.6) is the following problem:
Dual Problem.

(4.23) To find $\lambda \in \Lambda$ such that $J^*(\lambda) = \inf_{\mu \in \Lambda} J^*(\mu),$

where

(4.24) $J^*(\mu) = \tfrac12(\mathscr{A}\mu, \mu)_E - (\mathscr{F}, \mu)_E, \quad \mathscr{A} = BA^{-1}B^*, \quad \mathscr{F} = -BA^{-1}F.$

Remark 4.2. In view of Remark (3.2) and the fact that $g(v) = -Bv$
in our case we know that the gradient of $J^*$ is given by $G^*(\mu) = +Bu_\mu$.
We see easily that this is also the case in our present problem. In fact,

$G^*(\mu) = \mathscr{A}\mu - \mathscr{F} = BA^{-1}B^*\mu + BA^{-1}F = BA^{-1}(B^*\mu + F).$
On the other hand, by (4.17), $u_\mu = A^{-1}(B^*\mu + F)$ so that
$Bu_\mu = BA^{-1}(B^*\mu + F) = G^*(\mu).$
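
The identities behind Proposition 4.2 and Remark 4.2 can be checked by hand in the simplest possible case. The Python sketch below assumes $V = E = \mathbb{R}$, so that $A$, $B$, $F$ are plain numbers `a`, `b`, `f`; this scalar setting and the function name `dual_quantities` are our own illustrative assumptions.

```python
def dual_quantities(a, b, f, mu):
    """Scalar analogue of (4.17), (4.19), (4.22) and (4.24)."""
    u_mu = (f + b * mu) / a                                  # (4.17): u_mu = A^{-1}(F + B* mu)
    lagrangian = 0.5 * a * u_mu ** 2 - f * u_mu - b * u_mu * mu
    script_a = b * b / a                                     # (4.19): B A^{-1} B*
    script_f = -b * f / a                                    # (4.19): -B A^{-1} F
    j_star = 0.5 * script_a * mu ** 2 - script_f * mu        # (4.24)
    gradient = script_a * mu - script_f                      # (4.22)
    return u_mu, lagrangian, j_star, gradient


a, b, f, mu = 2.0, 3.0, 1.0, 0.5
u_mu, lagrangian, j_star, gradient = dual_quantities(a, b, f, mu)
```

With these sample numbers, `lagrangian` equals `-(j_star + 0.5 * f * f / a)`, which is the scalar form of the computation leading to (4.20), and `gradient` equals `b * u_mu`, which is the statement $G^*(\mu) = Bu_\mu$ of Remark 4.2.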

Algorithm. To determine a minimizing sequence for our primal
problem we can use the same algorithm as in the method of Uzawa.

Suppose $\lambda_0$ is an arbitrarily fixed point in $\Lambda$. We determine $u_0$ by
solving the equation

(4.25) $u_0 \in V, \quad Au_0 = F + B^*\lambda_0.$
If we have determined $\lambda_m$ (and $u_{m-1}$) iteratively, we determine $u_m$ as
the unique solution of the functional (differential in most of the
applications) equation

(4.26) $u_m \in V, \quad Au_m = F + B^*\lambda_m,$
i.e. $u_m$ is the solution of the equation

(4.26)' $a(u_m, \varphi) = (F + B^*\lambda_m, \varphi)_V = (F, \varphi)_V + (\lambda_m, B\varphi)_E, \quad \forall \varphi \in V.$
Then we define

(4.27) $\lambda_{m+1} = P_\Lambda(\lambda_m - \rho Bu_m)$

where $P_\Lambda$ is the projection of $E$ onto the closed convex set $\Lambda$ and $\rho > 0$
is a sufficiently small parameter.

The convergence of the algorithm to a solution of the minimization
problem for the (non-differentiable) functional $J$, $J = J_0 + J_1$, can be
proved exactly as in the proof of convergence in the method of Uzawa.
However, we shall omit the details of this proof.
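
The iteration (4.26), (4.27) can be sketched in Python for a small finite-dimensional case. The assumptions here are ours and are made only to keep the example self-contained: $V = \mathbb{R}^n$, $E = \mathbb{R}^m$, $A$ diagonal with positive entries `diag_a`, and $\Lambda = \Lambda_3$ (the unit ball of $|\cdot|_\infty$), so that $J_1(v) = |Bv|_1$ and $P_\Lambda$ is a componentwise clip to $[-1, 1]$. The name `uzawa_l1` and the default values of `rho` and `iterations` are hypothetical.

```python
def uzawa_l1(diag_a, f, b, rho=0.5, iterations=200):
    """Sketch of (4.26)-(4.27) for J(v) = 0.5*(Av, v) - (F, v) + |Bv|_1.

    diag_a : positive diagonal entries of A (assumed diagonal for simplicity)
    f      : the vector F, as a list of length n
    b      : the m x n matrix B, as a list of rows
    """
    m, n = len(b), len(f)
    lam = [0.0] * m                      # lambda_0 = 0 belongs to Lambda
    u = [0.0] * n
    for _ in range(iterations):
        # (4.26): solve A u_m = F + B^T lambda_m (trivial since A is diagonal).
        u = [(f[j] + sum(b[i][j] * lam[i] for i in range(m))) / diag_a[j]
             for j in range(n)]
        bu = [sum(b[i][j] * u[j] for j in range(n)) for i in range(m)]
        # (4.27): lambda_{m+1} = P_Lambda(lambda_m - rho * B u_m), a clip to [-1, 1].
        lam = [min(1.0, max(-1.0, lam[i] - rho * bu[i])) for i in range(m)]
    return u, lam


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
u, lam = uzawa_l1([1.0, 1.0, 1.0], [2.0, -0.5, -3.0], identity)
```

With $A = B = I$ the primal problem is the minimization of $\tfrac12|v|^2 - (F, v) + |v|_1$, whose solution is the componentwise soft-thresholding of $F$ by $1$; for $F = (2, -0.5, -3)$ this is $(1, 0, -2)$, and the iterates above approach it.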
Remark 4.3. If we choose the Hilbert space $E$, the convex set $\Lambda$ in $E$
and the operator $B \in \mathcal{L}(V, E)$ properly, this method provides a good
algorithm to solve the minimization problem for many of the known
non-differentiable functionals.

Remark 4.4. In the above algorithm, (4.26) is a linear system if $V$ is
finite dimensional, and if $V$ is an infinite dimensional (Hilbert) space
then (4.26) can be interpreted as a Neumann type problem.

Remark 4.5. We can also give an algorithm using the method of Frank
and Wolfe to solve the dual problem instead of the method of gradient
with projection. Here we can take $\rho > 0$ to be a fixed constant which is
sufficiently small.



Chapter 6 



Elements of the Theory of 
Control and Elements of 
Optimal Design 

This chapter will be concerned with two problems which can be treated
using the techniques developed in the previous chapters, namely,

(1) the optimal control problem, 

(2) the problem of optimal design. 

These two problems are somewhat similar. We shall reduce the 
problems to suitable minimization problems so that we can use the al- 
gorithms discussed in earlier chapters to obtain approximations to the 
solution of the two problems considered here. 

1 Optimal Control Theory 

We shall give an abstract formulation of the problem of optimal control 
and this can be considered as a problem of optimization for a functional 
on a convex set of functions. By using the duality method for example 
via the theorem of Ky Fan and Sion we reduce our control problem to a 
system consisting of the state equation, the adjoint state equation, and a 
variational inequality for the solution of the original problem. The
variational inequality can be considered as the Pontrjagin maximum principle
well known in control theory. In order to obtain an algorithm we
eliminate, at least formally, the state and obtain a pure minimization problem
for which we can use the appropriate algorithms described in earlier
chapters.

The theory of optimal control can roughly be described starting from 
the following data. We are given 

(i) A control $u$, which belongs to a given convex set $K$ of functions. $K$
is called the set of controls.

(ii) The state (of the system to be controlled) $y(u) = y_u$ is, for a given
$u \in K$, a solution of a functional equation. This equation is called
the state equation governing the problem of control.

(iii) A functional $J(y, u)$, called the cost function, defined by means
of certain non-negative functionals of $u$ and $y$.

If we set

$j(u) = J(y_u, u)$

then the problem of optimal control consists in finding a solution of the
minimization problem:

$u \in K$ such that $j(u) = \inf_{v \in K} j(v)$.

Usually the state equations governing the system to be controlled are
ordinary or partial differential equations.

The main object of the theory is to find necessary (and sufficient)
conditions for the existence and uniqueness of the solution of the above
problem and to obtain algorithms for determining approximations to the
solutions of the problem. We shall restrict ourselves to the optimal
control problem governed by partial differential equations of elliptic type,
more precisely, by linear homogeneous variational elliptic boundary
value problems. One can also consider, in a similar way, the problems
governed by partial differential equations of evolution type. (See, for
instance, the book of Lions [31].)
1.1 Formulation of the Problem of Optimal Control

Let $\Omega$ be a bounded open set in the Euclidean space $\mathbb{R}^N$ with smooth
boundary $\Gamma$. We shall denote the inner product and the corresponding
norm in the Hilbert space $L^2(\Omega)$ by $(\cdot, \cdot)$ and $\|\cdot\|$, while those in the
Sobolev space $V = H^1(\Omega)$ by $((\cdot, \cdot))$ and $|||\cdot|||$ respectively.
We suppose given the following:

Set of controls. A nonempty closed convex subset $K$ of $L^2(\Omega)$, called
the set of controls, and we denote the elements of $K$ by $u$, which we call
controls.

State equation. A continuous, bilinear, coercive form $a(\cdot, \cdot)$ on $V$, i.e.
there exist constants $\alpha_a > 0$ and $M_a > 0$ such that

(1.1) $|a(\varphi, \psi)| \le M_a |||\varphi|||\,|||\psi|||$ for all $\varphi, \psi \in V$,
      $a(\varphi, \varphi) \ge \alpha_a |||\varphi|||^2$ for all $\varphi \in V$.

Let $f \in L^2(\Omega)$ be given.

For any $u \in K$ a solution of the functional equation

(1.2) $y_u \in V, \quad a(y_u, \varphi) = (f, \varphi) + (u, \varphi)$ for all $\varphi \in V$

is said to define a state. The system is said to be governed
by the state equation (1.2). We know, by the results of the earlier chapters, that
for any $u \in K\ (\subset L^2(\Omega) \subset V')$ there exists a unique solution $y_u$ of (1.2).
Thus for a given $f$ and a given control $u \in K$ there exists a unique state $y_u$
governing the system.

Cost function. Let $b(\cdot, \cdot)$ be a symmetric, continuous and positive
semidefinite form on $V$, i.e. there exists a constant $M_b > 0$ such that

(1.3) $b(\varphi, \psi) = b(\psi, \varphi)$ for all $\varphi, \psi \in V$,
      $|b(\varphi, \psi)| \le M_b |||\varphi|||\,|||\psi|||$ for all $\varphi, \psi \in V$,
      $b(\varphi, \varphi) \ge 0$.

Further let $C \in \mathcal{L}(L^2(\Omega), L^2(\Omega))$ be an operator satisfying the following
conditions: there exist positive constants $\alpha_C > 0$ and $M_C > 0$ such that

(1.4) $(Cv, v) \ge \alpha_C \|v\|^2$ for all $v \in L^2(\Omega)$,
      $\|C\| \le M_C$.

Let $y_g \in V$ be given. We now define the functional

(1.5) $J(y, u) = \tfrac12 b(y - y_g, y - y_g) + \tfrac12 (Cu, u).$

Problem of control. This consists in finding a solution of the minimization
problem:

(1.6) $u \in K$ such that $J(y_u, u) = \inf_{v \in K} J(y_v, v).$

We shall show in the next section that the problem (1.6) has a unique
solution. However, we remark that one can also prove that a solution $u$ of
(1.6) exists and is unique directly, using the differential calculus of
Chapter 1 and the results of Chapter 2 on the existence and uniqueness
of minima of convex functionals.

Definition 1.1. The unique solution $u \in K$ of the problem (1.6) is called
the optimal control.

Remark 1.1. If the control set $K$ is a convex set described by a set of
functions defined over the whole of $\Omega$ and the constraint conditions are
imposed on the whole of $\Omega$ then the problem (1.6) is said to be one of
distributed control. This is the case we have considered here. However,
we can also consider in a similar way the problem when $K$ consists of
functions defined over the boundary $\Gamma$ of $\Omega$ and satisfying constraint
conditions on $\Gamma$. In this case the problem is said to be one of boundary
control. For example, we can consider

$\varphi \mapsto \int_\Gamma u\varphi\, d\sigma$

defined on a suitable class of functions $\varphi$ on $\Gamma$.
Remark 1.2. If we set

$j(u) = J(y_u, u)$

then the problem of control is a minimization problem for the functional
$u \mapsto j(u)$ on $K$.

Remark 1.3. Usually the state equations governing the system to be
controlled are ordinary differential equations, partial differential equations
or linear equations. (See the book of Lions [31].)

Remark 1.4. We have restricted ourselves to systems governed by a
linear homogeneous boundary value problem of Neumann type with distributed
control. One can treat in a similar way the systems governed by other
homogeneous or inhomogeneous boundary value problems; for instance,
problems of Dirichlet type or of mixed type, in which case we necessarily
have inhomogeneous problems.

Remark 1.5. In practice, the operator $C$ is of the form $\alpha I$ where $\alpha > 0$
is a small number.

1.2 Duality and Existence 

We shall show that there exists a unique solution of the optimal control
problem (1.6). We make use of the existence of a saddle point via the theorem
of Ky Fan and Sion (Theorem 1.2 of Chapter 5) for this purpose. This
also enables us to characterize the solution of the optimal control
problem (1.6). As in the earlier chapters we also obtain the dual problem
governed by the adjoint state equation.

We consider the optimal control problem as a minimization problem
for this purpose and we use duality in the variable $y$, keeping $u$ fixed in $K$.

We take for the cone $\Lambda$ the space $V = H^1(\Omega)$ itself and define the
functional

(1.7) $\Phi : V \times K \times \Lambda \to \mathbb{R}$

by setting

(1.7)' $\Phi(y, u; q) = a(y, q) - (f + u, q).$
It is clear that $\Phi$ is homogeneous of degree 1 in $q$:

$\Phi(y, u; \lambda q) = \lambda\,\Phi(y, u; q)$ for all $\lambda > 0$.

Next, $\Phi(y, u; q) \le 0$ for all $q \in \Lambda$ if and only if $u \in K$ and $y$, $u$ are
related by the state equation (1.2). In fact, the state equation implies
that $\Phi(y, u; q) = 0$. Conversely, $\Phi(y, u; q) \le 0$ implies that $u \in K$ and $y$, $u$
are related by the state equation. For, we have

$a(y, q) - (f + u, q) \le 0$ for all $q \in \Lambda$

and since, for any $q \in \Lambda$, $-q \in \Lambda$ also, we have

$a(y, -q) - (f + u, -q) \le 0.$

The two inequalities together imply that

$a(y, q) = (f + u, q)$ for all $q \in \Lambda = V = H^1(\Omega)$,

which means that $y = y_u = y(u)$. We introduce the Lagrangian $\mathscr{L}$
associated to the minimization problem by setting

(1.8) $\mathscr{L}(z, v; q) = J(z, v) + \Phi(z, v; q).$

More explicitly we have

(1.8)' $\mathscr{L}(z, v; q) = \tfrac12 b(z - y_g, z - y_g) + \tfrac12 (Cv, v) + a(z, q) - (f + v, q)$
for $z \in V$, $v \in K$ and $q \in \Lambda = V$.

We shall now prove the following theorem:

Theorem 1.1. There exists a saddle point for $\mathscr{L}(z, v; q)$ in $V \times K \times V$.

In other words,

(1.9) there exists $(y, u; p) \in V \times K \times V$ such that
      $\mathscr{L}(y, u; q) \le \mathscr{L}(y, u; p) \le \mathscr{L}(z, v; p)$ for all $(z, v; q) \in V \times K \times V$.
Proof. The proof will be carried out in several steps.

Step 1. (Application of the theorem of Ky Fan and Sion.) Let $\ell > 0$
be a constant which we shall choose suitably later. Consider the two sets

(1.10) $\Lambda_\ell = U_\ell = \{z \mid z \in V = H^1(\Omega);\ |||z||| \le \ell\}$ and
       $K_\ell = \{v \mid v \in K;\ \|v\| \le \ell\}.$

It is clear that $\Lambda_\ell = U_\ell$ is a closed convex and bounded set in $V$.
Since $K$ is closed and convex, $K_\ell$ is also a closed convex subset of $L^2(\Omega)$.
Hence, for the weak topologies, $V$ and $L^2(\Omega)$ are Hausdorff topological
vector spaces in which $U_\ell$ (respectively $K_\ell$) is compact.

On the other hand, since, for every $(z, v) \in U_\ell \times K_\ell$, the functional

$U_\ell \ni q \mapsto \mathscr{L}(z, v; q) \in \mathbb{R}$

is linear and strongly (and hence also for the weak topology on $V$)
continuous, it is concave and upper semi-continuous (for the weak topology
on $V$). The mapping

$U_\ell \times K_\ell \ni (z, v) \mapsto \mathscr{L}(z, v; q) \in \mathbb{R}$

is strongly continuous and hence, in particular, (weakly) lower
semi-continuous for every fixed $q \in \Lambda_\ell = U_\ell$. Since the bilinear forms $a(\cdot, \cdot)$,
$b(\cdot, \cdot)$ on $V$ and $(C\cdot, \cdot)$ on $L^2(\Omega)$ are positive semi-definite and $v \mapsto (v, q)$
is linear, it follows from the results of the earlier chapters that the mapping

$(z, v) \mapsto \mathscr{L}(z, v; q)$

is convex.

Thus all the hypotheses of the theorem of Ky Fan and Sion
(Theorem 1.2 of Chapter 5) are satisfied. Hence there exists a saddle point
$(y_\ell, u_\ell; p_\ell) \in U_\ell \times K_\ell \times U_\ell$. This is the same as saying

(1.11)' there exists $(y_\ell, u_\ell; p_\ell) \in U_\ell \times K_\ell \times \Lambda_\ell$ such that
        $J(y_\ell, u_\ell) + \Phi(y_\ell, u_\ell; q) \le J(y_\ell, u_\ell) + \Phi(y_\ell, u_\ell; p_\ell) \le J(z, v) + \Phi(z, v; p_\ell)$
        for all $(z, v; q) \in U_\ell \times K_\ell \times \Lambda_\ell$.

Choosing $\ell > 0$ sufficiently large we shall show, in the following
steps, that $y_\ell$, $u_\ell$, $p_\ell$ are bounded independent of the choice of such an $\ell$.

Step 2. $u_\ell$ is bounded. In fact, the second inequality in (1.11)'
means that the functional

$(z, v) \mapsto \mathscr{L}(z, v; p_\ell)$

on $U_\ell \times K_\ell$ attains a local minimum at $(y_\ell, u_\ell)$. But since this functional
is convex, by the results on convex functionals of the earlier chapters it is
also a global minimum, i.e. we have

(1.12) $\mathscr{L}(y_\ell, u_\ell; q) \le \mathscr{L}(y_\ell, u_\ell; p_\ell) \le \mathscr{L}(z, v; p_\ell)$
       for all $z \in V$, $v \in K$ and $q \in \Lambda_\ell = U_\ell$.

Now we fix a $v \in K$ arbitrarily and take $q = 0$, $z = y_v$ in (1.11)' and
we obtain

$J(y_\ell, u_\ell) \le J(y_\ell, u_\ell) + \Phi(y_\ell, u_\ell; p_\ell) \le J(y_v, v) = j(v).$

It follows from this that, for any fixed $v \in K$, we have

(1.13) $\Phi(y_\ell, u_\ell; p_\ell) \ge 0$ and $J(y_\ell, u_\ell) \le j(v).$

But by (1.3), (1.4) the latter inequality in (1.13) implies that

$\tfrac12 \alpha_C \|u_\ell\|^2 \le J(y_\ell, u_\ell) \le j(v),$
which means that $u_\ell$ is bounded:

(1.14) $\|u_\ell\| \le c_1, \quad c_1 = (2\alpha_C^{-1} j(v))^{1/2}.$



Step 3. $y_\ell$ is bounded. As before we fix a $v \in K$ and take $z = y_v$,
$q = \ell\,|||y_\ell|||^{-1} y_\ell \in U_\ell = \Lambda_\ell$ in (1.11)'. (We may assume that $y_\ell \ne 0$, for
otherwise there is nothing to prove.) We get

$J(y_\ell, u_\ell) + \ell\,|||y_\ell|||^{-1}\,\Phi(y_\ell, u_\ell; y_\ell) \le j(v)$

because of the homogeneity of $\Phi$ in the last argument. Here $J(y_\ell, u_\ell) \ge 0$
because of (1.3), (1.4) and (1.5), so that we get

$\ell\,|||y_\ell|||^{-1}\,\Phi(y_\ell, u_\ell; y_\ell) \le j(v).$
By the coercivity (1.1) of $a(\cdot, \cdot)$ on $V$ we have

$\alpha_a |||y_\ell|||^2 \le a(y_\ell, y_\ell)$
and by the Cauchy-Schwarz inequality we have

$|(f + u_\ell, y_\ell)| \le \|f + u_\ell\|\,\|y_\ell\| \le (\|f\| + \|u_\ell\|)\,|||y_\ell|||.$
Hence using (1.14)

$\ell\,\alpha_a |||y_\ell||| \le j(v) + \ell\,|||y_\ell|||^{-1}(f + u_\ell, y_\ell)$
$\qquad \le j(v) + \ell(\|f\| + \|u_\ell\|)$
$\qquad \le j(v) + \ell(\|f\| + c_1)$

so that, first by dividing by $\ell$, we see that if $\ell \ge 1$ then

(1.15) $|||y_\ell||| \le \alpha_a^{-1}(j(v) + \|f\| + c_1) = c_2.$

Step 4. $p_\ell$ is bounded. For this we recall that, as has already been
observed, $(y_\ell, u_\ell)$ is a global minimum for the convex functional

$g : (z, v) \mapsto \mathscr{L}(z, v; p_\ell)$

on $V \times K$. Hence, by Theorem 1.3 of Chapter 2, the G-derivative of $g$ at
$(y_\ell, u_\ell)$ should vanish:

$g'(y_\ell, u_\ell; \varphi, v) = 0$ for all $(\varphi, v) \in V \times K$.
This on calculation of the derivative gives

$b(y_\ell - y_g, \varphi) + (Cu_\ell, v) + a(\varphi, p_\ell) - (f + u_\ell, \varphi) = 0$
for all $(\varphi, v) \in V \times K$.

Taking $\varphi = p_\ell$ and $v = u_\ell$ we get

$(Cu_\ell, u_\ell) + a(p_\ell, p_\ell) = (f + u_\ell, p_\ell) - b(y_\ell - y_g, p_\ell).$
Using the coercivity of the terms on the left side and the Cauchy-Schwarz
inequality for the first term on the right side, together with the
continuity of $b(\cdot, \cdot)$, we find that

$\alpha_a |||p_\ell|||^2 \le \alpha_C \|u_\ell\|^2 + \alpha_a |||p_\ell|||^2 \le \|f + u_\ell\|\,\|p_\ell\| + M_b |||y_\ell - y_g|||\,|||p_\ell|||$
$\qquad \le (\|f\| + \|u_\ell\| + M_b |||y_\ell - y_g|||)\,|||p_\ell|||$
$\qquad \le (\|f\| + c_1 + M_b c_2 + M_b |||y_g|||)\,|||p_\ell|||$

which implies that there exists a constant $c_3 > 0$ such that

(1.16) $|||p_\ell||| \le c_3.$

Step 5. We now choose $\ell > \max(c_1, c_2, 2c_3, 1)$ and use the sets $U_\ell$
and $K_\ell$ for the application of the theorem of Ky Fan and Sion.

Step 6. To show that $y_\ell = y_{u_\ell}$ (i.e. $y_\ell$ is the solution of the state
equation corresponding to the control $u_\ell \in K$). For this purpose we have
to show that

(1.17) $\Phi(y_\ell, u_\ell; q) = 0$ for all $q \in \Lambda = V$.

We already know from (1.13) that $\Phi(y_\ell, u_\ell; p_\ell) \ge 0$. Since $q = 2p_\ell \in \Lambda = V$
satisfies

$|||q||| = 2|||p_\ell||| \le 2c_3 < \ell$

we can take $q = 2p_\ell$ in the first inequality of (1.11)' and get

$2\Phi(y_\ell, u_\ell; p_\ell) \le \Phi(y_\ell, u_\ell; p_\ell)$

so that we also have

(1.18) $\Phi(y_\ell, u_\ell; p_\ell) \le 0.$
Then it follows once again from the first inequality of (1.11)' that

(1.19) $\Phi(y_\ell, u_\ell; q) \le 0$ for all $q \in \Lambda_\ell = U_\ell$.

If $q \notin U_\ell$ then $\ell\,|||q|||^{-1} q \in U_\ell$, which on substituting in (1.19) gives,
by the homogeneity of $\Phi$, $\Phi(y_\ell, u_\ell; q) \le 0$ for all $q \in V$; applying this
also to $-q$ we obtain (1.17).

Finally, combining the facts (1.12) and (1.17) together with the
definition of $\mathscr{L}(z, v; q)$ we conclude that there exists a saddle point $(y, u; p)$
in $V \times K \times V$. This completes the proof of the theorem.

Theorem 1.1 implies that $(y, u)$ is the solution of the primal
problem and $p$ is the solution of the dual problem. The equation (1.17)
is nothing but the fact that $y$ is the solution $y_u$ of the state equation.

From the above theorem we obtain the main result on existence (and
uniqueness) of the solution to the optimal control problem and also a
characterization of this solution. For this purpose, if we choose $v = u$
in the second inequality of (1.9) we find that $y \in V$ is the minimum of the
convex functional

$h : V \ni z \mapsto \mathscr{L}(z, u; p) \in \mathbb{R}.$
Hence taking the G-derivative of $h$ we should have

$h'(y, \psi) = b(y - y_g, \psi) + a(\psi, p) = 0$ for all $\psi \in V$.
Thus we see that $p$ satisfies the equation

(1.20) $a(\psi, p) = -b(y - y_g, \psi)$ for all $\psi \in V$.

The equation (1.20) is thus the adjoint state equation in the present
problem. Again, in view of the hypotheses (1.1) and (1.3) it follows (by
the Lax-Milgram lemma) that, for any given $y \in V$, there exists a unique
solution $p \in V$ of the equation (1.20).

Now consider the functional

$k : K \ni v \mapsto \mathscr{L}(y, v; p) \in \mathbb{R}.$

The second inequality in (1.9) with $z = y$ implies that this functional
$k$ is minimum at $v = u$. Again taking G-derivatives we have

$k'(v, w) = (Cv, w) - (w, p)$ for all $w \in K$.
The solution of the minimization problem for $k$ on $K$ is, by Theorem
2.2 of Chapter 2, characterized by

$u \in K$ such that $k'(u, v - u) \ge 0$ for all $v \in K$,
which is the same as the variational inequality

(1.21) $u \in K$ such that $(Cu, v - u) - (p, v - u) \ge 0$ for all $v \in K$.



The above facts can now be summarized as follows:

Theorem 1.2. Suppose given the set $K$ of controls, the state equation
(1.2) and the cost function $J$ defined by (1.5) such that the hypotheses
(1.1), (1.3) and (1.4) are satisfied. Then we have the following:

(i) The optimal control problem (1.6) has a unique solution $u \in K$.

(ii) The unique solution $u$ of the optimal control problem is
characterized by the coupled system consisting of the pair of equations
(1.2) and (1.20) defining the state $y$ and the adjoint state $p$
governing the system together with the variational inequality (1.21).

(iii) A solution $(y, u; p)$ to (1.2), (1.20) and (1.21) exists (and is unique)
and is the unique saddle point of the Lagrangian $\mathscr{L}$ defined by
(1.8).

Remark 1.6. The variational inequality (1.21) is nothing but the well
known maximum principle of Pontrjagin in the classical theory of
controls.
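
To make the coupled system of Theorem 1.2 (ii) concrete, here is a small Python sketch of a scalar analogue. Everything specific in it is our own assumption rather than part of the text: $V = L^2 = \mathbb{R}$, $a(y, \varphi) = a\,y\varphi$, $b(y, \varphi) = b\,y\varphi$, $C = c$, $K = [u_{\min}, u_{\max}]$, and a projected-gradient iteration on $j(u) = J(y_u, u)$ as one possible algorithm. It uses the fact that, in this scalar case, $j'(u) = cu - p$ with $p$ given by the adjoint equation. The function name `solve_control` and all parameter values are hypothetical.

```python
def solve_control(a, b, c, f, y_goal, u_min, u_max, rho=0.5, iterations=200):
    """Projected-gradient sketch for a scalar analogue of (1.2), (1.20), (1.21)."""

    def state_and_adjoint(u):
        y = (f + u) / a                  # state equation (1.2):   a * y = f + u
        p = -b * (y - y_goal) / a        # adjoint equation (1.20): a * p = -b * (y - y_g)
        return y, p

    u = min(u_max, max(u_min, 0.0))      # some starting control in K
    for _ in range(iterations):
        y, p = state_and_adjoint(u)
        # Gradient of j at u is c*u - p; project the step back onto K.
        u = min(u_max, max(u_min, u - rho * (c * u - p)))
    y, p = state_and_adjoint(u)
    return y, u, p


# Constraint active: the unconstrained minimiser u = 1 lies outside K = [0, 0.5].
y1, u1, p1 = solve_control(1.0, 1.0, 1.0, 0.0, 2.0, 0.0, 0.5)
# Constraint inactive: K = [0, 5] contains the unconstrained minimiser.
y2, u2, p2 = solve_control(1.0, 1.0, 1.0, 0.0, 2.0, 0.0, 5.0)
```

In the first case the result is $u = 0.5$, $y = 0.5$, $p = 1.5$, and $(cu - p)(v - u) = -(v - 0.5) \ge 0$ for every $v \in [0, 0.5]$, which is the variational inequality (1.21). In the second case $u = 1$ and $cu - p = 0$, so (1.21) holds with equality.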



1.3 Elimination of State

In order to obtain an algorithm for the construction of approximations to
the solution of the optimal control problem (1.6) we use the
characterization given by Theorem 1.2 (ii) to obtain a pure minimization
problem with constraints. This is achieved by eliminating the state $y_u$
which occurs explicitly in the above characterization.

We can rewrite the control problem (1.6) in terms of the operators
defined on $V$ by the bilinear forms $a(\cdot, \cdot)$ and $b(\cdot, \cdot)$
and the operator defined by the inclusion mapping of $V = H^1(\Omega)$ in
$L^2(\Omega)$.

In fact, for any fixed $y \in V$, the linear form

$\varphi \mapsto a(y, \varphi)$

is continuous on $V$ by (1.1), and hence by the Riesz representation
theorem there exists a unique element $Ay \in V$ such that

(1.22) $a(y, \varphi) = ((Ay, \varphi))$ for all $\varphi \in V$.

Once again from (1.1) the mapping $y \mapsto Ay$ is a continuous linear
operator on $V$. Similarly, by (1.3) there exists a continuous linear
operator $B$ on $V$ such that

(1.23) $b(y, \varphi) = ((By, \varphi))$ for all $\varphi \in V$.

Finally, since the inclusion mapping of $V$ in $L^2(\Omega)$ is continuous
and linear, it follows that for any $u \in L^2(\Omega)$ the linear mapping
$v \mapsto (u, v)$ on $V$ is a continuous linear functional. Hence there
exists a continuous linear operator $D : L^2(\Omega) \to V$ such that

(1.24) $(u, v) = ((Du, v))$ for all $u \in L^2(\Omega)$, $v \in V$.

The state equation can now be written as

$((Ay, \varphi)) = ((Df + Du, \varphi))$ for all $\varphi \in V$,

which is the same as the operator equation in $V$:

(1.25) $Ay = Df + Du$.
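The passage from a bilinear form to its representing operator can be made concrete in finite dimensions. The following is a minimal sketch, assuming $V = \mathbb{R}^2$ with inner product $((x, y)) = x^T G y$ and a bilinear form $a(x, y) = x^T K y$; the matrices `G` and `K` and all helper names are hypothetical choices made only for illustration. In this setting the relation (1.22) forces $A = G^{-1}K^T$.

```python
# Finite-dimensional analogue of (1.22): represent a bilinear form a(.,.)
# by an operator A with respect to a given inner product ((.,.)).
# G, K and every helper name below are illustrative assumptions.

def mat_vec(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def inv2(M):
    # Inverse of a 2x2 matrix by the explicit formula.
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_mul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))] for i in range(len(M))]

def transpose(M):
    return [list(row) for row in zip(*M)]

G = [[2.0, 1.0], [1.0, 3.0]]   # ((x, y)) = x^T G y, symmetric positive definite
K = [[4.0, 1.0], [0.0, 5.0]]   # a(x, y) = x^T K y, not necessarily symmetric

def inner(x, y):
    return dot(x, mat_vec(G, y))

def a_form(x, y):
    return dot(x, mat_vec(K, y))

# a(y, phi) = ((A y, phi)) for all phi  <=>  G A = K^T  <=>  A = G^{-1} K^T
A = mat_mul(inv2(G), transpose(K))

y = [1.0, -2.0]
phi = [0.5, 3.0]
lhs = a_form(y, phi)             # a(y, phi)
rhs = inner(mat_vec(A, y), phi)  # ((A y, phi))
print(lhs, rhs)
```

The operators $B$ and $D$ of (1.23) and (1.24) would be obtained in the same way from the matrices of $b(\cdot, \cdot)$ and of the $L^2$ inner product.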

In view of the well known result of Lax and Milgram we have

Theorem 1.3. Under the hypothesis (1.1) the state equation (1.2) (or
equivalently (1.25)) has a unique solution $y_u \in V$ for any given
$u \in L^2(\Omega)$, and there exists a constant $c > 0$ such that

(1.26) $|||y_u||| \le c\,(|||Df||| + |||Du|||)$.

This is equivalent to saying that the operator $A$ is invertible, that
$A^{-1}$ is a continuous linear operator on $V$, and that (1.26) gives an
estimate for the norm of $A^{-1}$. Hence we can write

(1.27) $y_u = A^{-1}(Df + Du)$

as the solution of the state equation.

Next we shall reduce the optimal control problem (1.6) to a minimization
problem as follows. We substitute $y_u = A^{-1}(Df + Du)$, given by (1.27),
in the cost function (1.5) and thus eliminate the state from the
functional to be minimized. Using (1.23) together with (1.27) we can write

$b(y_u - y_g, y_u - y_g) = ((B(y_u - y_g), y_u - y_g))$
$= ((B[A^{-1}(Df + Du) - y_g], A^{-1}(Df + Du) - y_g))$
$= ((BA^{-1}Du, A^{-1}Du)) + 2((B(A^{-1}Df - y_g), A^{-1}Du)) + ((B(A^{-1}Df - y_g), A^{-1}Df - y_g))$
$= ((A^{-1*}BA^{-1}Du, Du)) + 2((A^{-1*}B(A^{-1}Df - y_g), Du)) + G(f, y_g)$,

where $A^{-1*}$ is the adjoint of the operator $A^{-1}$ and $G(f, y_g)$
denotes the functional

$G(f, y_g) = ((BA^{-1}Df - By_g, A^{-1}Df - y_g))$,

which is independent of $u$. Once again using (1.24) we can write

$b(y_u - y_g, y_u - y_g) = (A^{-1*}BA^{-1}Du, u) + 2(A^{-1*}B(A^{-1}Df - y_g), u) + G(f, y_g)$

and hence the cost function can be written in the form

(1.28) $j(u) = \frac{1}{2}(\mathcal{A}u, u) - (\mathcal{F}, u) + G(f, y_g)$,

where the operator $\mathcal{A}$ on $L^2(\Omega)$ collects the terms
quadratic in $u$ (including the control cost $(Cu, u)$) and
$\mathcal{F} \in L^2(\Omega)$ collects the terms linear in $u$.

We have the following

Proposition 1.1. The optimal control problem (1.6) is equivalent to the
minimization problem:

(1.29) to find $u \in K$ such that $j(u) = \inf_{v \in K} j(v)$, where
$j(v) = \frac{1}{2}(\mathcal{A}v, v) - (\mathcal{F}, v) + G(f, y_g)$.

We observe that, since the last term in the expression for the quadratic
functional $j(v)$ is a constant (independent of $v$), $u \in K$ is a
solution of (1.29) if and only if $u$ is a solution of the minimization
problem:

(1.30) to find $u \in K$ such that $k(u) = \inf_{v \in K} k(v)$, where
$k(v) = \frac{1}{2}(\mathcal{A}v, v) - (\mathcal{F}, v)$.

We know by the results of Chapter 2, §3 (Theorem 3.1) that the problem
(1.30) has a unique solution and that it is characterized by the condition

$k'(u, v - u) \ge 0$ for all $v \in K$,

where $k'(\cdot, \varphi)$ denotes the G-derivative of $k(\cdot)$. This is
nothing but the variational inequality

(1.31) to find $u \in K$ such that $(\mathcal{A}u - \mathcal{F}, v - u) \ge 0$
for all $v \in K$.

This variational inequality (1.31) together with the state equation is
an equivalent formulation of the characterization of the optimal control
problem given by Theorem 1.2 (ii). More precisely, we have the following

Theorem 1.4. The solution of the optimal control problem (1.6) is
characterized by the variational inequality:

(1.32) to find $u \in K$ such that $(Cu - p_u, v - u) \ge 0$ for all $v \in K$,

where $p_u$ is the adjoint state.

Proof. We have, by the definitions (1.28) of $\mathcal{A}$ and $\mathcal{F}$,

$\mathcal{A}u - \mathcal{F} = A^{-1*}B(A^{-1}Du + A^{-1}Df - y_g) + Cu$,

which on using the state equation (1.25) becomes

(1.33) $\mathcal{A}u - \mathcal{F} = A^{-1*}B(y_u - y_g) + Cu$.

If we now define $p_u$ by setting

(1.34) $-p_u = A^{-1*}B(y_u - y_g)$

then we see that $p_u$ satisfies the functional equation

$((A^*p_u, \psi)) = -((B(y_u - y_g), \psi))$ for all $\psi \in V$.

We notice that this is nothing but the adjoint state equation:

$a(\psi, p_u) = -b(y_u - y_g, \psi)$ for all $\psi \in V$.

Thus if, for a given control $u \in K$, $y_u$ is the solution of the state
equation, then $p_u$ defined by (1.34) is the solution of the adjoint state
equation. Moreover, we have

(1.33)' $\mathcal{A}u - \mathcal{F} = Cu - p_u$.

Substituting (1.33)' in the variational inequality (1.31) we obtain the
assertion of the theorem. □

We are thus reduced to a pure minimization problem in $K$ for which
we have known algorithms.

1.4 Approximation

The formulation of the optimal control problem as a pure minimization
problem given above in Section 1.3, together with the algorithms
described in earlier chapters for the minimization problem, immediately
leads to algorithms for determining approximations to the solution of
the optimal control problem (1.6). Hence we shall only mention this
briefly in the following.

We observe first of all that the operator $\mathcal{A}$ is
$L^2(\Omega)$-coercive and bounded. In fact, in view of (1.24) and (1.23)
we can write

$(A^{-1*}BA^{-1}Du, u) = ((A^{-1*}BA^{-1}Du, Du)) = ((BA^{-1}Du, A^{-1}Du)) = b(A^{-1}Du, A^{-1}Du) \ge 0$.

Since we also have $(Cu, u) \ge \alpha_C \|u\|^2$, we find that $\mathcal{A}$
is $L^2(\Omega)$-coercive and

(1.35) $(\mathcal{A}u, u) = (A^{-1*}BA^{-1}Du + Cu, u) \ge \alpha_C \|u\|^2$, $u \in L^2(\Omega)$.

To prove that $\mathcal{A}$ is bounded we note that $A^{-1}$ is the operator

$L^2(\Omega) \ni f + u \mapsto y_u \in L^2(\Omega)$

defining the solution of the state equation:

$y_u \in V$ such that $a(y_u, \varphi) = ((Ay_u, \varphi)) = ((D(f + u), \varphi))$ for all $\varphi \in V$.

Here, taking $\varphi = y_u$ and using the coercivity of the bilinear form
$a(\cdot, \cdot)$, we see that

$\alpha_a |||y_u|||^2 \le |||Df + Du||| \, |||y_u|||$

and hence

$|||y_u||| = |||A^{-1}(Df + Du)||| \le \alpha_a^{-1} |||Df + Du|||$,

which implies that $A^{-1}$ is bounded and, in fact, we have

(1.36) $\|A^{-1}\|_{\mathcal{L}(V, V)} \le \alpha_a^{-1}$.

Now since all the operators involved in the definition of $\mathcal{A}$ are
linear and bounded, it follows that $\mathcal{A}$ is also bounded. Moreover,
we also have

$\|\mathcal{A}\|_{\mathcal{L}(L^2(\Omega), L^2(\Omega))} = \|A^{-1*}BA^{-1}D + C\|_{\mathcal{L}(L^2(\Omega), L^2(\Omega))}$
$\le \|A^{-1}\|^2_{\mathcal{L}(V, V)} \|B\|_{\mathcal{L}(V, V)} \|D\|_{\mathcal{L}(L^2(\Omega), V)} + \|C\|_{\mathcal{L}(L^2(\Omega), L^2(\Omega))}$

and hence (since $\|D\|_{\mathcal{L}(L^2(\Omega), V)} = 1$)

(1.37) $\|\mathcal{A}\|_{\mathcal{L}(L^2(\Omega), L^2(\Omega))} \le \alpha_a^{-2} M_b + M_C$.

We are now in a position to describe the algorithms.

Method of contraction. We recall that the solution of the optimal
control problem is equivalent to the solution of the minimization problem
(1.29), and that the solution of this is characterized by the variational
inequality (1.31):

$u \in K$ such that $(\mathcal{A}u - \mathcal{F}, v - u) \ge 0$ for all $v \in K$.

We can now use the method of contraction mapping (as is standard
in the proof of existence of solutions of variational inequalities; see,
for instance, Lions and Stampacchia [ ]) to describe an algorithm for the
solution of the variational inequality (1.31).

Algorithm. Suppose we know an algorithm to calculate numerically
the projection $P$ of $L^2(\Omega)$ onto $K$. Let $\rho$ be a constant
(which we fix) such that

(1.38) $0 < \rho < 2\alpha_C/(\alpha_a^{-2}M_b + M_C)^2 = 2\alpha_a^4\alpha_C/(M_b + \alpha_a^2 M_C)^2$.

Let $u_0 \in K$ be arbitrarily chosen. Suppose $u_1, \dots, u_m$ are
determined starting from $u_0$. We define $u_{m+1}$ by setting

(1.39) $u_{m+1} = P\Phi(u_m)$

where

(1.39)' $\Phi(u_m) = u_m - \rho\mathcal{A}u_m + \rho\mathcal{F}$.

We can express $\Phi(u_m)$ in terms of the operators $A$, $B$, $C$ and the
data $f$ and $y_g$, using (1.33), as follows:

(1.40) $\Phi(u_m) = u_m - \rho(A^{-1*}BA^{-1}Du_m + Cu_m) - \rho A^{-1*}B(A^{-1}Df - y_g)$.

The choice (1.38) of $\rho$ implies that the mapping

(1.41) $T : K \ni w \mapsto P\Phi(w) \in K$

is a contraction, so that $T$ has a fixed point $u$ in $K$ to which the
sequence $u_m$ converges.
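The iteration (1.39) can be tried out on a small finite-dimensional model. The sketch below assumes $K = [0, 1]^2$ and replaces the operator and the data by a hypothetical symmetric positive definite $2 \times 2$ matrix `A_op` and a vector `F_vec`; these values, and the choice $\rho = \alpha/M^2$ (which lies in the admissible range $0 < \rho < 2\alpha/M^2$, with $\alpha$ the coercivity constant and $M$ the norm of the matrix), are assumptions made only for illustration.

```python
import math

def project_box(u, lo=0.0, hi=1.0):
    # Projection onto K = {lo <= u_i <= hi}: componentwise clipping.
    return [min(max(ui, lo), hi) for ui in u]

def apply_matrix(M, u):
    return [sum(M[i][j] * u[j] for j in range(len(u))) for i in range(len(M))]

def contraction_solve(A_op, F_vec, rho, u0, iters=200):
    # Fixed point iteration u_{m+1} = P(u_m - rho * (A u_m - F)).
    u = project_box(u0)
    for _ in range(iters):
        Au = apply_matrix(A_op, u)
        u = project_box([ui - rho * (Aui - Fi)
                         for ui, Aui, Fi in zip(u, Au, F_vec)])
    return u

# Hypothetical data: symmetric positive definite matrix and right-hand side.
A_op = [[3.0, 1.0], [1.0, 2.0]]
F_vec = [5.0, 2.0]

# Extreme eigenvalues of the symmetric 2x2 matrix give alpha and M.
half_trace = (A_op[0][0] + A_op[1][1]) / 2.0
det = A_op[0][0] * A_op[1][1] - A_op[0][1] * A_op[1][0]
spread = math.sqrt(half_trace ** 2 - det)
alpha, M = half_trace - spread, half_trace + spread

rho = alpha / M ** 2          # inside the interval (0, 2*alpha/M**2)
u = contraction_solve(A_op, F_vec, rho, [0.0, 0.0])
print(u)
```

For this data the unconstrained minimizer $(1.6, 0.2)$ lies outside $K$, and the iterates settle on the constrained solution $(1, 0.5)$, at which the residual $\mathcal{A}u - \mathcal{F} = (-1.5, 0)$ satisfies the variational inequality.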

Method of gradient with projection. We consider the minimization
problem for the quadratic functional

(1.42) $v \mapsto \mathcal{J}(v) = \frac{1}{2}(\mathcal{A}v, v) - (\mathcal{F}, v)$

on $K$. Since $\mathcal{A}$ is coercive, we can use the method of gradient
with projection described in the earlier chapters, and we can show that a
convergent choice is a constant $\rho > 0$ together with the direction of
descent

(1.43) $w_m = \operatorname{grad}\mathcal{J}(u_m)/\|\operatorname{grad}\mathcal{J}(u_m)\|$.

Thus starting from an arbitrary $u_0 \in K$, we define

(1.44) $u_{m+1} = P_K(u_m - \rho\operatorname{grad}\mathcal{J}(u_m)/\|\operatorname{grad}\mathcal{J}(u_m)\|)$

where $P_K$ is the projection of $L^2(\Omega)$ onto $K$.

This method, however, requires the computation of $\mathcal{J}(u_m)$ and
its gradient at each step. For this purpose, knowing $u_m \in K$, we have to
solve the state equation:

$y_m \in V$ such that $a(y_m, \varphi) = (f + u_m, \varphi)$ for all $\varphi \in V$

to obtain $y_m$, and the adjoint state equation:

$p_m \in V$ such that $a(\varphi, p_m) = -b(y_m - y_g, \varphi)$ for all $\varphi \in V$

to obtain the adjoint state $p_m$. We can then calculate
$\operatorname{grad}\mathcal{J}(u_m)$ by using

(1.45) $\operatorname{grad}\mathcal{J}(u_m) = Cu_m - p_m$.

We shall not go into the details of the algorithm, which we leave to
the reader.
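The two solves behind (1.45) can be sketched on a one-dimensional model. The code below is a minimal illustration under several assumptions that are not in the text: the state equation is $-y'' + y = f + u$ on $(0, 1)$ with Neumann conditions, discretized by piecewise linear elements with a lumped mass matrix; $b$ is the $L^2$ inner product; $C$ is the identity; and the cost is normalized as $\frac{1}{2}b(y - y_g, y - y_g) + \frac{1}{2}(u, u)$. All function names and data are hypothetical. The gradient $u - p$ obtained from one state solve and one adjoint solve is compared with a difference quotient of the cost.

```python
import math

def solve_tridiag(lower, diag, upper, rhs):
    # Thomas algorithm; lower[i] couples row i to i-1, upper[i] to i+1.
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

n = 20                                   # number of intervals (assumption)
h = 1.0 / n
xs = [i * h for i in range(n + 1)]
mass = [h] * (n + 1)                     # lumped mass: discrete L2 weights
mass[0] = mass[n] = h / 2.0
diag = [2.0 / h + m for m in mass]       # stiffness + mass, Neumann ends
diag[0], diag[n] = 1.0 / h + mass[0], 1.0 / h + mass[n]
off = [-1.0 / h] * (n + 1)

f = [1.0 + x for x in xs]                # hypothetical data
yg = [math.cos(math.pi * x) for x in xs]

def state(u):
    # a(y, phi) = (f + u, phi) for all phi
    return solve_tridiag(off, diag, off, [m * (fi + ui) for m, fi, ui in zip(mass, f, u)])

def adjoint(y):
    # a(phi, p) = -b(y - yg, phi) for all phi
    return solve_tridiag(off, diag, off, [-m * (yi - gi) for m, yi, gi in zip(mass, y, yg)])

def cost(u):
    y = state(u)
    return 0.5 * sum(m * ((yi - gi) ** 2 + ui ** 2) for m, yi, gi, ui in zip(mass, y, yg, u))

def gradient(u):
    # grad J(u) = C u - p with C = identity, as in (1.45)
    p = adjoint(state(u))
    return [ui - pi for ui, pi in zip(u, p)]

u = [x * (1.0 - x) for x in xs]
du = [math.sin(3.0 * x) for x in xs]     # test direction
adjoint_value = sum(m * gi * di for m, gi, di in zip(mass, gradient(u), du))
eps = 0.5                                # the cost is quadratic, so any step works
u_plus = [ui + eps * di for ui, di in zip(u, du)]
u_minus = [ui - eps * di for ui, di in zip(u, du)]
fd_value = (cost(u_plus) - cost(u_minus)) / (2.0 * eps)
print(adjoint_value, fd_value)
```

Because the discrete cost is exactly quadratic in `u`, the central difference quotient reproduces the directional derivative up to rounding, which makes it a convenient check on the sign conventions of the adjoint equation.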

Remark 1.7. This method is rather long as it involves several steps for 
each of which we have sub-algorithms for computations. Hence this 
procedure may not be very economical. 

1.46 

As an illustration of the methods described in this section we consider
the following two-dimensional optimal control problem. Let $\Omega$ be a
bounded open set in $\mathbb{R}^2$ with smooth boundary $\Gamma$. We
consider the following optimal control problem:

State equation: $-\Delta y_u + y_u = f + u$ in $\Omega$,
$\partial y_u/\partial n = 0$ on $\Gamma$,

where $n$ denotes the exterior normal vector field to $\Gamma$.

Control set: $K = \{u \in L^2(\Omega) \mid 0 \le u(x) \le 1 \text{ a.e. on } \Omega\}$.

Cost function: $J(y, u) = \int_\Omega (|y_u - y_g|^2 + |u|^2)\,dx$.

We shall leave the description of the algorithm for this problem, on
the lines suggested in this section, as an exercise to the reader.
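For the control set of this example the projection onto $K$ is explicit: it is the pointwise truncation of the values to $[0, 1]$. The sketch below, in which the grid, the sample function `w` and the helper names are hypothetical, applies this truncation to grid values and checks the characterization of the projection, $(w - Pw, v - Pw) \le 0$ for all $v \in K$, on a few sample elements of $K$.

```python
def project_K(w):
    # Projection onto K = {0 <= u <= 1}: truncate each value to [0, 1].
    return [min(1.0, max(0.0, wi)) for wi in w]

def l2_inner(u, v, h):
    # Discrete L2 inner product on a uniform grid of mesh size h.
    return h * sum(ui * vi for ui, vi in zip(u, v))

n = 10
h = 1.0 / n
w = [-0.5 + 0.2 * i for i in range(n)]   # sample values, from -0.5 up to 1.3
Pw = project_K(w)

# A few sample elements of K against which the inequality is tested.
samples = [
    [0.0] * n,
    [1.0] * n,
    [i / (n - 1) for i in range(n)],
    [0.5] * n,
]
residual = [wi - pi for wi, pi in zip(w, Pw)]
worst = max(l2_inner(residual, [vi - pi for vi, pi in zip(v, Pw)], h)
            for v in samples)
print(Pw, worst)
```

With this projection available, either the contraction iteration (1.39) or the gradient iteration (1.44) can be applied once the state and adjoint equations have been discretized.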

2 Theory of Optimal Design 

In this section we shall be concerned with the problem of optimal design.
We shall show that certain free boundary problems can be considered as
special cases of this type of optimal design problem. We shall consider
a special one-dimensional problem and explain a very general method to
obtain a solution to the problem, which also enables us to give
algorithms to obtain approximations to the solution. This method can
be seen to be readily applicable to higher dimensional problems also,
except for some technical details. Though there is a certain similarity
with the problem of optimal control, we cannot use the duality method
used earlier in that case, as we shall see later.

2.0 Optimal Design 

In this section we shall give a general formulation of the problem of
optimal design. Once again this problem will be considered as a
minimization problem for a suitable class of functionals. As in the case
of the optimal control problem, these functionals are defined through a
family of state equations. We shall consider here the states governing the
system to be determined by variational elliptic boundary value problems.
Though there is some analogy with the optimal control problem studied
in the previous section, there is an important difference: in the present
case the convex set $K$ (in our case the set $K$ will be the whole of a
Hilbert space), on which the given functional is to be minimized, is
itself in some sense to be determined, as it is a set of functions
on the optimal domain to be determined by the problem. Therefore this
problem cannot be treated as an optimal control problem and requires
somewhat different techniques from the ones used before.

Roughly speaking the problem of optimal design can be described
as follows: Suppose given

(1) A family of possible domains $\Omega$ (bounded open sets in the
Euclidean space) having certain minimum regularity properties.

(2) A family of elliptic boundary value problems describing the states,
one on each $\Omega$ of the family in (1).

(3) A cost function $j$ (described in terms of the state determined by
(2), considered as a functional of the domain $\Omega$ in the family).

Then the problem consists in finding a domain $\Omega^*$ in the given
family for which $j$ is a minimum.

We shall describe a fairly general theory to obtain a solution to the
optimal design problem. In order to simplify the details we shall,
however, describe our general method in the special case of one dimension.
Thus the states governing the problem are described by solutions of a two
point boundary value problem for a linear second order ordinary
differential equation. We shall first describe the main formal steps
involved in the reduction of the problem to one of minimization in a fixed
domain. We shall then make the necessary hypotheses and show that this
formal procedure is justified.

2.1 Formulation of the Problem of Optimal Design 

Let $\mathcal{A}$ be a family of bounded open sets $\Omega$ in
$\mathbb{R}^n$ and let $\Gamma$ denote the boundary of $\Omega$,
$\Omega \in \mathcal{A}$. We assume that every $\Omega \in \mathcal{A}$
satisfies some regularity properties. For instance, every
$\Omega \in \mathcal{A}$ satisfies a cone condition, or every
$\Omega \in \mathcal{A}$ has a locally Lipschitz boundary, etc.

We suppose the following data:

(1) For each $\Omega \in \mathcal{A}$ we are given a bilinear form

$V \times V \ni (y, \varphi) \mapsto a(\Omega; y, \varphi) \in \mathbb{R}$

on $V = V_\Omega = H^1(\Omega)$ such that

(i) it is continuous; i.e. there exists a constant $M_\Omega > 0$ such that

(2.1) $a(\Omega; \varphi, \psi) \le M_\Omega \|\varphi\|_V \|\psi\|_V$ for all $\varphi, \psi \in V = H^1(\Omega)$, $\Omega \in \mathcal{A}$;

(ii) it is $H^1(\Omega)$-coercive: there exists a constant $C_\Omega > 0$ such that

(2.2) $a(\Omega; \varphi, \varphi) \ge C_\Omega \|\varphi\|^2_{H^1(\Omega)}$ for all $\varphi \in H^1(\Omega)$, $\Omega \in \mathcal{A}$.

Example 2.1. Let $\Omega \in \mathcal{A}$ and

(2) For each $\Omega \in \mathcal{A}$ we are given a continuous linear
functional $\varphi \mapsto L(\Omega; \varphi)$ on $H^1(\Omega)$,
$\Omega \in \mathcal{A}$.

Example 2.2. Let $F \in L^2(\mathbb{R}^n)$ and $f = F|_\Omega$ = restriction
of $F$ to $\Omega$.

$L(\Omega; \varphi) = \int_\Omega f\varphi\,dx$ for all $\varphi \in H^1(\Omega)$, $\Omega \in \mathcal{A}$.

Consider the variational elliptic boundary value problem:

(2.3) To find $y = y_\Omega \in H^1(\Omega)$ such that
$a(\Omega; y, \varphi) = L(\Omega; \varphi)$ for all $\varphi \in H^1(\Omega)$.

We know by the Lax-Milgram lemma that under the assumptions (1)
and (2) there exists a unique solution $y_\Omega \in H^1(\Omega)$ for this
problem (2.3). We observe that, since $f$ is given as $F|_\Omega$, this
solution depends only on the geometry of $\Omega$, $\Omega \in \mathcal{A}$.

(3) Cost function. For each $\Omega \in \mathcal{A}$ we are given a
functional on $H^1(\Omega)$:

(2.4) $H^1(\Omega) \ni z \mapsto J(\Omega; z) \in \mathbb{R}$.

Example 2.3.

$J(\Omega; z) = \int_\Gamma |z - g|^2\,d\sigma$, where
$g = G|_\Gamma$, $G \in H^1(\mathbb{R}^n)$, $\Omega \in \mathcal{A}$.

Example 2.4.

$J(\Omega; z) = \int_\Omega |z - g|^2\,dx$, where
$G \in L^2(\mathbb{R}^n)$ and $g = G|_\Omega$, $\Omega \in \mathcal{A}$.

Example 2.5 (a family $\mathcal{A}$ of domains).

Suppose $B$ and $\omega$ are two fixed open subsets of $\mathbb{R}^n$ such
that $\omega \subset B$. Let $\mathcal{A}$ be the family of open sets
$\Omega$ in $\mathbb{R}^n$ such that $\omega \subset \Omega \subset B$ and
$\Omega$ satisfies some regularity property (say, for instance, $\Omega$
satisfies a cone condition).

Define

(2.5) $j(\Omega) = J(\Omega; y_\Omega)$, $\Omega \in \mathcal{A}$,

where $y_\Omega$ is the (unique) solution of the homogeneous boundary value
problem (2.3).

The problem of optimal design consists in minimizing $j(\Omega)$ over
$\mathcal{A}$:

(2.6) To find $\Omega^* \in \mathcal{A}$ such that
$j(\Omega^*) = \inf_{\Omega \in \mathcal{A}} j(\Omega)$.

Optimal design and free boundary problem. Certain free boundary
problems can be considered as problems of optimal design, as is
illustrated by the following example in two dimensions.

Let $\Gamma_0$ be a smooth curve in the plane $\mathbb{R}^2$ defined by an
equation of the form

(2.7) $z(x) = x_1 - \varphi(x_2) = 0$,

where $\varphi : I = [0, 1] \ni x_2 \mapsto \varphi(x_2) \in \mathbb{R}^+$ is
a smooth function. Let $Q$ denote the (open) strip in $\mathbb{R}^2$:

(2.8) $Q = \{x = (x_1, x_2) \in \mathbb{R}^2 \mid x_1 > 0,\ 0 < x_2 < 1\}$.

Consider the open set $\Omega$ given by

(2.9) $\Omega = \{x \in Q \mid z(x) < 0\} = \{x = (x_1, x_2) \in Q \mid x_1 < \varphi(x_2)\}$.

The boundary $\Gamma$ of $\Omega$ decomposes into a union
$\Sigma \cup \Gamma_0$ with $\Sigma^\circ \cap \Gamma_0^\circ = \emptyset$.

There exists a one-one correspondence between $\Omega$ and the function
$z$. Thus the family $\mathcal{A}$ is determined by the family of smooth
functions

$z : Q \to \mathbb{R}$.

Let us consider the optimal design problem:

(2.10)
$a(\Omega; y, \varphi) = (y, \varphi)_{H^1(\Omega)}$ for $y, \varphi \in H^1(\Omega)$;
$L(\Omega; \varphi) = (f, \varphi)_{L^2(\Omega)}$ for $\varphi \in H^1(\Omega)$, where $f = F|_\Omega$, $F \in L^2(\mathbb{R}^2)$;
$J(\Omega; z) = \int_{\Gamma_0} |z(x)|^2\,d\sigma$, where $d\sigma$ is the line element on $\Gamma_0$.

Then $y = y_\Omega$ is the unique solution of the Neumann problem:

(2.11) $y_\Omega \in H^1(\Omega)$,
$(y_\Omega, \varphi)_{H^1(\Omega)} = (f, \varphi)_{L^2(\Omega)}$ for all $\varphi \in H^1(\Omega)$,

and

(2.12) $j(\Omega) = J(\Omega; y_\Omega) = \int_{\Gamma_0} |y_\Omega(x)|^2\,d\sigma$.

The optimal design problem then becomes

(2.13) To find $\Omega^* \in \mathcal{A}$ such that $j(\Omega^*) \le j(\Omega)$ for all $\Omega \in \mathcal{A}$.

In other words,

(2.13)' To find $y_{\Omega^*} \in H^1(\Omega^*)$ such that
$\int_{\Gamma_0^*} |y_{\Omega^*}(x)|^2\,d\sigma$ is minimum.

Suppose now that $\inf_{\Omega \in \mathcal{A}} j(\Omega) = 0$. Then it
follows that

(2.14) $y_{\Omega^*} = 0$ a.e. on $\Gamma_0^*$.

In this case, the optimal design problem reduces to the following so
called "free boundary problem":

To find a domain $\Omega^* \in \mathcal{A}$ whose boundary is of the form
$\Gamma^* = \Sigma \cup \Gamma_0^*$, where $\Sigma$ is a fixed curve while
$\Gamma_0^*$ is a curve determined by the solution of the homogeneous
boundary value problem

(2.13)'' $-\Delta y + y = f$ in $\Omega^*$,
$\partial y/\partial n = 0$ on $\Sigma$,
$\partial y/\partial n = 0$, $y = 0$ on $\Gamma_0^*$.
This equivalent formulation is obtained in the standard manner from
the state equation (2.3) using Green's formula together with the
condition (2.14). Free boundary problems occur naturally in many contexts,
for example in the theory of gas dynamics.

2.2 A Simple Example

We shall illustrate our general method to obtain approximations to the
solution of the optimal design problem for the following one dimensional
problem.

Let $\mathcal{A}$ denote the family of open intervals

(2.15) $\Omega_a = (0, a)$, $a \ge 1$

on the real line.

State equation. Assume that an $f \in L^2(\mathbb{R}^1)$ is given. The
state governing the system is a solution of the following problem:

(2.16) To find $y_{\Omega_a} \in H^1(\Omega_a) = H^1(0, a)$ such that
$a(\Omega_a; y_{\Omega_a}, \varphi) = \int_0^a \left( \frac{dy_{\Omega_a}}{dx}\frac{d\varphi}{dx} + y_{\Omega_a}\varphi \right) dx = \int_0^a f\varphi\,dx = L(\Omega_a; \varphi)$
for all $\varphi \in H^1(\Omega_a)$.

On integration by parts (or more generally, using Green's formula) we
see that this is nothing but the variational formulation of the two point
boundary value problem (of Neumann type):

(2.16)' To find $y_{\Omega_a} \in H^1(\Omega_a)$ satisfying
$-\frac{d^2 y_{\Omega_a}}{dx^2} + y_{\Omega_a} = f$ in $\Omega_a$,
$\frac{dy_{\Omega_a}}{dx}(0) = 0 = \frac{dy_{\Omega_a}}{dx}(a)$.

Cost function. Suppose given a $g \in L^2(0, 1)$. Define

(2.17) $j(a) = \frac{1}{2}\int_0^1 |y_{\Omega_a} - g|^2\,dx$.

Problem of optimal design.

(2.18) To find $a^* \ge 1$ (i.e. to find $\Omega^* = \Omega_{a^*}$) such that
$j(a^*) \le j(a)$ for all $a \ge 1$.

Remark 2.1. It appears natural to consider $a$ as the control variable and
use the duality argument as we did in the case of the optimal control
problem. However, since the space $V = H^1(\Omega_a)$ varies with $a$, the
duality method may not be useful to devise algorithms.

In what follows, we shall adopt the following notation to simplify
the writing:

(2.19) $y_{\Omega_a}(x) = y(a, x)$, $\frac{\partial y}{\partial x}(a, x) = y'(a, x)$, $\frac{\partial y}{\partial a}(a, x) = y_a(a, x)$.

2.3 Computation of the Derivative of j.

We shall use the method of gradient to obtain algorithms to construct
approximations converging to the required solution of the problem (2.18).
In order to be able to apply the gradient method we make, in this section,
the formal computation of the gradient of $j$ (in the present case, the
derivative of $j$ with respect to $a$). We justify the various steps
involved under suitable hypotheses in the next section.

Setting, for $\varphi \in H^1(\Omega_a)$,

(2.20) $F(a, x) = y'(a, x)\varphi'(a, x) + y(a, x)\varphi(a, x) - f(a, x)\varphi(a, x)$

we can write the state equation (2.16) as

$\int_0^a F(a, x)\,dx = 0$.

Here, since we have a Neumann type boundary value problem for a
second order ordinary differential operator, the test function $\varphi$
belongs to $H^1(\Omega_a)$ and so $\varphi$ is defined in a variable domain
$\Omega_a = (0, a)$. This may cause certain inconveniences, which however
can easily be overcome as follows:

(1) We can take $\varphi$ to be the restriction to $\Omega_a$ of a function
$\psi \in H^1(0, +\infty)$ and write the state equation as

(2.21) $\int_0^a \{y'(a, x)\psi'(x) + y(a, x)\psi(x) - f(x)\psi(x)\}\,dx = 0$ for all $\psi \in H^1(0, +\infty)$.

Such a choice for the test functions $\varphi \in H^1(\Omega_a)$ would suffice when
the state is described by a Neumann type problem (as we have in
the present case). But if the boundary conditions are of Dirichlet
type, this choice is not suitable, since the restrictions of functions
in $H^1(0, +\infty)$ to $\Omega_a$ do not necessarily give functions in the
space of test functions $H^1_0(\Omega_a)$. We can use another method in
which such a problem does not arise, and we shall use this method.

(2) Suppose $\psi \in H^m(\Omega_1)$, $\Omega_1 = (0, 1)$ and $m \ge 2$. Then
the function $x \mapsto \varphi(a, x)$ defined by

(2.22) $\varphi(a, x) = \psi(x/a)$

is well defined in $\Omega_a$ and belongs to
$H^m(\Omega_a) \hookrightarrow H^1(\Omega_a)$. (This inclusion, we note, is a
dense inclusion.) We also note that, in this case, if
$\psi \in H^m_0(\Omega_1)$ then $\varphi \in H^m_0(\Omega_a)$ and conversely.

Thus we set

(2.20)' $F(a, x) = y'(a, x)\psi'(x/a) + (y(a, x) - f(x))\psi(x/a)$ for $\psi \in H^m(\Omega_1)$

and we can write the state equation with this $F$ as

(2.21)' $K(a) = \int_0^a F(a, x)\,dx = 0$.

We shall make use of the following classical result to calculate the
derivative $dK/da$.

Let $A$ denote the closed subset of the $(x, a)$-plane:

(2.23) $A = \{(x, a) \in \mathbb{R}^2 ;\ a \ge 1 \text{ and } 0 \le x \le a\}$.



Suppose $F : A \to \mathbb{R}$ is a function satisfying:

Hypothesis (1). For every $a \ge 1$, the real valued function

$x \mapsto F(a, x)$

is continuous in $0 \le x \le a$.

Hypothesis (2). For every $x \in [0, a]$, the function

$a \mapsto F(a, x)$

is differentiable and $\partial F/\partial a : A \to \mathbb{R}$ is
continuous. Then the integral

$K(a) = \int_0^a F(a, x)\,dx$

exists, $a \mapsto K(a)$ belongs to $C^1(1 \le a < +\infty)$ and we have

(2.24) $\frac{dK}{da}(a) = \int_0^a \frac{\partial F}{\partial a}(a, x)\,dx + F(a, a)$.

Remark 2.2. We observe that this classical result has a complete
analogue also in higher dimensions, and we have a similar identity for
$\operatorname{grad}_a K$ (with respect to $a$) in place of $dK/da$.
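The identity (2.24) is easy to check numerically. The following sketch assumes the sample integrand $F(a, x) = \sin(ax) + x^2$, which is our own choice and satisfies Hypotheses (1) and (2); the quadrature routine and all names are hypothetical helpers. It compares a central difference quotient of $K$ with the right hand side of (2.24).

```python
import math

def simpson(func, lo, hi, n=200):
    # Composite Simpson rule with an even number n of subintervals.
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

def F(a, x):
    # Sample integrand (an assumption made for illustration).
    return math.sin(a * x) + x * x

def dF_da(a, x):
    # Partial derivative of F with respect to a.
    return x * math.cos(a * x)

def K(a):
    # K(a) = integral of F(a, x) over the moving interval (0, a).
    return simpson(lambda x: F(a, x), 0.0, a)

a0 = 1.5
delta = 1e-4
lhs = (K(a0 + delta) - K(a0 - delta)) / (2.0 * delta)       # dK/da by differences
rhs = simpson(lambda x: dF_da(a0, x), 0.0, a0) + F(a0, a0)  # right side of (2.24)
print(lhs, rhs)
```

The boundary term $F(a, a)$ accounts for the motion of the upper limit of integration; dropping it would leave only the contribution of the explicit dependence of $F$ on $a$.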



Now differentiating the state equation, written with $F$ given by (2.20)',
with respect to $a$ and using the above result we get

$\frac{dK}{da}(a) = \int_0^a \frac{\partial F}{\partial a}(a, x)\,dx + F(a, a)$
$= \int_0^a [\{y'_a(a, x)\psi'(x/a) + y_a(a, x)\psi(x/a)\} + \{y'(a, x)(\psi'(x/a))_a + y(a, x)(\psi(x/a))_a - f(x)(\psi(x/a))_a\}]\,dx$
$+ [y'(a, x)\psi'(x/a) + y(a, x)\psi(x/a) - f(x)\psi(x/a)]_{x=a} = 0$.

We observe that, if $m \ge 2$, then $x \mapsto (\psi(x/a))_a \in H^1(0, a)$. In fact,

$(\psi(x/a))_a = (-x/a^2)\psi'(x/a) \in L^2(\Omega_a)$,

$(\psi(x/a))'_a = (-1/a^2)\psi'(x/a) + (-x/a^3)\psi''(x/a) \in L^2(\Omega_a)$,

where $\psi'$ and $\psi''$ are the (strong) $L^2$-derivatives of $\psi$, which
exist since $\psi \in H^2(0, 1)$.

Hence by the state equation (2.16) we find that

$\int_0^a \{y'(a, x)(\psi'(x/a))_a + y(a, x)(\psi(x/a))_a - f(x)(\psi(x/a))_a\}\,dx$
$= a(\Omega_a; y_{\Omega_a}, (\psi(x/a))_a) - L(\Omega_a; (\psi(x/a))_a) = 0$.

Thus we conclude that

(2.25) $\int_0^a \{y'_a(a, x)\psi'(x/a) + y_a(a, x)\psi(x/a)\}\,dx = -[y'(a, x)\psi'(x/a) + y(a, x)\psi(x/a) - f(x)\psi(x/a)]_{x=a}$
for all $\psi \in H^m(0, 1)$ with $m \ge 2$.

Remark 2.3. It is obvious that the above argument easily carries over to
dimensions $\ge 2$ for the computation of $\operatorname{grad}_a K(a)$.

Finally, we calculate the derivative of the cost function $j$ with respect
to $a$ and we have

(2.26) $\frac{dj}{da} = \int_0^1 (y(a, x) - g(x))\,y_a(a, x)\,dx$.

In (2.26) we eliminate the derivative $y_a$ of the state $y_{\Omega_a}$ using
the adjoint state equation. The adjoint state $p_{\Omega_a} = p(a, x)$ is the
solution of the equation:

(2.27) $\int_0^a \{\varphi'(x)p'(a, x) + \varphi(x)p(a, x)\}\,dx = \int_0^1 (y(a, x) - g(x))\varphi(x)\,dx$
for all $\varphi \in H^1(0, a)$.

If we know that $y(a, x)$ is sufficiently regular, for instance, say
$y_a \in H^1(\Omega_a)$, then taking $\varphi = y_a(a, x)$ in the adjoint state
equation (2.27) above we obtain

(2.27)' $\int_0^a \{y'_a(a, x)p'(a, x) + y_a(a, x)p(a, x)\}\,dx = \int_0^1 (y(a, x) - g(x))\,y_a(a, x)\,dx = \frac{dj}{da}$.

This together with (2.25) for $\psi = p$ gives

(2.28) $\frac{dj}{da} = -[y'(a, x)p'(a, x) + y(a, x)p(a, x) - f(x)p(a, x)]_{x=a}$.
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2.4 Hypothesis and Results 

In the calculation of the derivative of the cost function $j(a)$ in the
previous section we have made use of the regularity properties of the
state $y_{\Omega_a} = y(a, x)$ as well as of the adjoint state
$p_{\Omega_a} = p(a, x)$ with respect to both the variables $x$ and $a$. This
in turn implies the regularity of the function $F(a, x)$ defined by
(2.20)', which is required for the validity of the theorem on
differentiation of the integral $K(a)$ of $F(a, x)$. The regularity of
$y(a, x)$ and $p(a, x)$ is again necessary in order that the expression on
the right side of (2.28) for the derivative $dj/da$ has a meaning. We shall
obtain, from the regularity theory of boundary value problems for
(ordinary) differential equations, the regularity of $y$ and $p$ as a
consequence of suitable hypotheses on the data $f$ and $g$.

We begin with the following assumptions on the data:

Hypothesis (3). For all $a \ge 1$, $t \mapsto f(at) \in H^1(0, 1)$.

Hypothesis (4). $g \in H^1(0, 1)$.

Then we have the following

Proposition 2.1. (Existence of the derivatives $y_a$ and $y'_a$.) Under the
hypothesis (3) on $f$, if $y(a, x)$ is the solution of the state equation
(2.16) then

(i) $x \mapsto y(a, x) \in H^3(0, a)$;

(ii) $y_a$ exists and $x \mapsto y_a(a, x) \in H^2(0, a)$; and as a
consequence we have

(iii) $x \mapsto y(a, x) \in C^2([0, a])$ and
$x \mapsto y_a(a, x) \in C^1([0, a])$.
Proof. By a change of variable of the form

(2.29) $x = at$, $x \in (0, a)$ and $t \in (0, 1)$

we can transform the state equation (2.16) to a two point boundary value
problem in the fixed domain $\Omega_1 = (0, 1)$. Under the transformation
(2.29) we have the one-one correspondence between $y$ and $u$ given by

(2.30) $y(a, at) = u(a, t)$, $u(a, x/a) = y(a, x)$

and for $m \ge 1$ we have:

(2.31) $x \mapsto y(a, x) \in H^m(0, a)$ if and only if $t \mapsto u(a, t) \in H^m(0, 1)$.

Similarly if $\varphi \in H^m(0, 1)$ then

(2.32) $x \mapsto \psi(a, x) = \varphi(x/a) = \varphi(t) \in H^m(0, a)$

and conversely. Moreover, we also have

(2.33) $y'(a, x) = a^{-1}\,\partial u/\partial t(a, x/a) = a^{-1}u_t(a, x/a)$,
$\psi'(a, x) = a^{-1}\varphi_t(x/a)$,

so that the state equation can now be written as

(2.34) $\int_0^a \{a^{-2}u_t(a, x/a)\varphi_t(x/a) + u(a, x/a)\varphi(x/a) - f(x)\varphi(x/a)\}\,dx = 0$
for all $\varphi \in H^m(0, 1)$.

By the transformation (2.29) this becomes

(2.34)' $\int_0^1 \{a^{-2}u_t(a, t)\varphi_t(t) + (u(a, t) - f(at))\varphi(t)\}\,dt = 0$
for all $\varphi \in H^m(0, 1)$.



Since $H^m(0, 1)$ is dense in $H^1(0, 1)$ (for any $m \ge 1$) it follows that
(2.34)' is valid also for any $\varphi \in H^1(0, 1)$. This means that
$t \mapsto u(a, t)$ is a solution of the two point boundary value problem

(2.34)'' $u = u(a, t)$,
$-a^{-2}\,\partial^2 u/\partial t^2 + u = f(at)$,
$u_t(a, 0) = 0 = u_t(a, 1)$.

Since $t \mapsto f(at) \in H^1(0, 1)$ by Hypothesis (3) we know, from the
regularity theory for (ordinary) differential equations, that

$t \mapsto u(a, t) \in H^3(0, 1)$

which proves (i). Then by Sobolev's lemma $t \mapsto u(a, t) \in C^2([0, 1])$.
It follows then that

(2.35) $x \mapsto y(a, x) = u(a, x/a) \in C^2([0, a])$,

which proves the first part of (iii).

In order to prove that $y_a$ exists and is regular it is enough to prove
the same for $u_a$. For this purpose, we shall show that $u_a$ satisfies a
second order (elliptic) variational boundary value problem.

We note that, by the theorem of dependence on parameters, the solution
of (2.34)'' as a function of the variable $a$ is differentiable, since
the Hypothesis (3) implies that

(2.36) $(\partial f/\partial a)(at) = t f_t(at) \in L^2(0, 1)$.

Now if we differentiate (2.34)' with respect to $a$ we get

(2.37) $\int_0^1 \{a^{-2}u_{t,a}(a, t)\varphi_t(t) + u_a(a, t)\varphi(t)\}\,dt = 2a^{-3}\int_0^1 u_t(a, t)\varphi_t(t)\,dt + \int_0^1 f_t(at)\,t\,\varphi(t)\,dt$
for all $\varphi \in H^m(0, 1)$.



Here, on the right side, the first term exists since
$t \mapsto u_t(a, t) \in L^2(0, 1)$, while the second term exists since
$t \mapsto f_t(at) \in L^2(0, 1)$ by Hypothesis (3). Now
$t \mapsto u(a, t) \in H^3(0, 1)$ implies that
$u_{tt} \in H^1(0, 1) \subset L^2(0, 1)$, and so on integrating by parts we
find that

$\int_0^1 u_t(a, t)\varphi_t(t)\,dt = -\int_0^1 u_{tt}(a, t)\varphi(t)\,dt + [u_t(a, t)\varphi(t)]_{t=0}^{t=1}$.

Since $u_t(a, t) = a\,y'(a, x)$, the boundary conditions in (2.34)''
imply that

$[u_t(a, t)\varphi(t)]_{t=0}^{t=1} = [a\,y'(a, x)\varphi(x/a)]_{x=0}^{x=a} = 0$.

Hence the right side of (2.37) can be written as

(2.38) $\int_0^1 \{-2a^{-3}u_{tt}(a, t) + t f_t(at)\}\varphi(t)\,dt$.

Since $-2a^{-3}u_{tt}(a, t) + t f_t(at) \in L^2(0, 1)$, we conclude that
$u_a(a, t)$ satisfies a variational second order (elliptic) boundary value
problem (2.37) with the right hand side data (2.38) in $L^2(0, 1)$. Then by
the regularity theory of solutions of (ordinary) differential equations it
follows that

(2.39) $t \mapsto u_a(a, t) \in H^2(0, 1)$.

Then

(2.39)' $y_a(a, x) = u_a(a, x/a) + (-x/a^2)u_t(a, x/a) \in H^2(0, a)$,

which proves the assertion (ii). Again, applying Sobolev's lemma to
$y_a$, the second part of (iii) is also proved. This proves the proposition
completely.

We also have the following regularity property for the adjoint state
$p(a, x)$.

Proposition 2.2. If $f$ satisfies the Hypothesis (3) and $g$ the Hypothesis
(4), then the adjoint state $x \mapsto p(a, x)$ belongs to $H^3(0, a)$ and
consequently $x \mapsto p(a, x) \in C^2([0, a])$.

Proof. The adjoint state equation (2.27) is transformed by (2.29) as
follows:

$p(a, at) = q(a, t)$ and $\psi(a, x) = \varphi(x/a)$,

$\int_0^a \{a^{-2}q_t(a, x/a)\varphi_t(x/a) + q(a, x/a)\varphi(x/a)\}\,dx = \int_0^a (u(a, x/a) - g(x/a))\varphi(x/a)\,dx$
for all $\varphi \in H^1(0, 1)$.

That is, we have

(2.40) $\int_0^1 \{a^{-2}q_t(a, t)\varphi_t(t) + q(a, t)\varphi(t)\}\,dt = \int_0^1 (u(a, t) - g(t))\varphi(t)\,dt$
for all $\varphi \in H^1(0, 1)$.

Since on the right hand side $t \mapsto u(a, t) - g(t) \in H^1(0, 1)$ by
Proposition 2.1 above, it follows, again by the regularity theory for
ordinary differential equations, that

(2.41) $t \mapsto q(a, t) \in H^3(0, 1)$.

This is equivalent to saying that

(2.41)' $x \mapsto p(a, x) \in H^3(0, a)$.

By Sobolev's lemma it follows that $x \mapsto p(a, x) \in C^2([0, a])$,
completing the proof of the proposition. □



Next we verify that $F$ defined by (2.20)' satisfies the required
Hypotheses (1) and (2) for the validity of the calculation of $dj/da$.

If we assume that $\varphi \in H^3(0, 1)$ then
$x \mapsto \varphi(x/a) \in H^3(0, a)$ and then, by Sobolev's lemma,
$x \mapsto \varphi(x/a) \in C^2([0, a])$ and
$\varphi'(x/a) \in H^2(0, a) \subset C^1([0, a])$. Hence we find, on using
Proposition 2.1 (i) and (iii), that

(2.42) $x \mapsto F(a, x) = y'(a, x)\varphi'(x/a) + (y(a, x) - f(x))\varphi(x/a) \in C^0([0, a])$

since we know that $f \in H^1(0, a) \subset C^0([0, a])$ by Hypothesis (3) and
Sobolev's lemma. Moreover, differentiating the expression for $F$ with
respect to $a$ and using Proposition 2.1 (ii) and (iii), we see that

(2.43) $x \mapsto y'_a(a, x)\varphi'(x/a) + y_a(a, x)\varphi(x/a) + y'(a, x)(\varphi'(x/a))_a + (y(a, x) - f(x))(\varphi(x/a))_a \in C^0([0, a])$,

which proves that $F : A \to \mathbb{R}$ satisfies the Hypotheses (1) and
(2). Thus the expression on the right hand side of (2.28) has a meaning
since

(2.44) $y'(a, x)p'(a, x) + (y(a, x) - f(x))p(a, x) \in C^0([0, a])$

and we obtain

(2.28)' $dj/da = -[y'(a, a)p'(a, a) + (y(a, a) - f(a))p(a, a)]$.

Thus we have proved the following main result of this section:

Theorem 2.1. Under the Hypotheses (3) and (4) on the data $f$ and $g$, the
cost function $a \mapsto j(a)$ is differentiable and $dj/da$ is given by
(2.28)', where $y(a, x)$ and $p(a, x)$ represent the direct and adjoint
states respectively governing the problem of optimal design (2.18).

Remark 2.4. The general method described in this section is not, in
general, used for one-dimensional problems since it is not economical to
compute $dj/da$, which in turn involves computations of $y$ and $p$ and
their derivatives (see (2.28)'). In the case of one dimensional problems
other more efficient and simpler methods are known in the literature. The
importance of our method consists in its usefulness in higher dimensions
to devise algorithms using, for instance, the gradient method.
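Formula (2.28)' can be checked on data for which the state is known in closed form. The sketch below assumes $f(x) = x$ and $g(x) = x^2$, which are our own illustrative choices satisfying Hypotheses (3) and (4). For this $f$ the solution of (2.16)' is $y(a, x) = x + \tanh(a/2)\cosh x - \sinh x$, and taking $\varphi(x) = \cosh x$ in (2.27) gives $\sinh(a)\,p(a, a) = \int_0^1 (y - g)\cosh x\,dx$. Since $y'(a, a) = 0$ by the Neumann condition, only the second term of (2.28)' survives. All function names are hypothetical.

```python
import math

def simpson(func, lo, hi, n=400):
    # Composite Simpson rule with an even number n of subintervals.
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3.0

def f(x):
    return x              # illustrative data

def g(x):
    return x * x          # illustrative data

def y(a, x):
    # Closed-form state for f(x) = x: -y'' + y = x, y'(0) = 0 = y'(a).
    return x + math.tanh(a / 2.0) * math.cosh(x) - math.sinh(x)

def j(a):
    # Cost (2.17): half the squared L2(0, 1) distance between y and g.
    return 0.5 * simpson(lambda x: (y(a, x) - g(x)) ** 2, 0.0, 1.0)

def p_at_a(a):
    # Testing (2.27) with phi(x) = cosh(x) yields sinh(a) * p(a, a).
    integral = simpson(lambda x: (y(a, x) - g(x)) * math.cosh(x), 0.0, 1.0)
    return integral / math.sinh(a)

def dj_formula(a):
    # (2.28)' with y'(a, a) = 0, so only the second term remains.
    return -(y(a, a) - f(a)) * p_at_a(a)

a0 = 2.0
delta = 1e-4
fd = (j(a0 + delta) - j(a0 - delta)) / (2.0 * delta)   # difference quotient
formula = dj_formula(a0)
print(fd, formula)
```

In this example only boundary values of the state and of the adjoint state at $x = a$ enter the derivative, which is the feature that makes the method attractive for gradient algorithms in higher dimensions.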
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